GLOBAL ANALYSIS OF PROTEIN ACTIVITIES
USING PROTEOME CHIPS 


This invention was made with Government support under grant numbers CA77808 
and GM62480 awarded by the National Institutes of Health. The Government may have 
certain rights in the invention. 

RELATED APPLICATIONS

This application claims benefit of United States provisional application nos. 
60/290,583, filed on May 11, 2001, and 60/308,149, filed on July 26, 2001, each of which is 
incorporated herein by reference in its entirety. 

1. FIELD OF THE INVENTION

The present invention relates to proteome chips comprising arrays having a large 
proportion of all proteins expressed in a single species. The invention also relates to 
methods for making proteome chips. The invention also relates to methods for using 
proteome chips to systematically assay all protein interactions in a species in a
high-throughput manner. 

The present invention also relates to methods for making and purifying eukaryotic 
proteins in a high-density array format. The invention also relates to methods for making 
protein arrays by attaching double-tagged fusion proteins to a solid support. The invention 
also relates to a method for identifying whether a signal is positive.

2. BACKGROUND OF THE INVENTION 

A daunting task in the post-genome sequencing era is to understand the functions, 
modifications, and regulation of every protein encoded by a genome (Fields et al., 1999,
Proc. Natl. Acad. Sci. U.S.A. 96:8825; Goffeau et al., 1996, Science 274:563). Currently, much
effort is devoted toward studying gene, and hence protein, function by analyzing mRNA
expression profiles, gene disruption phenotypes, two-hybrid interactions, and protein
subcellular localization (Ross-Macdonald et al., 1999, Nature 402:413; DeRisi et al., 1997,
Science 278:680; Winzeler et al., 1999, Science 285:901; Uetz et al., 2000, Nature 403:623;
Ito et al., 2000, Proc. Natl. Acad. Sci. U.S.A. 97:1143). Although these studies are useful,
transcriptional profiles do not necessarily correlate well with cellular protein levels. Thus, 
the analysis of biochemical activities can provide information about protein function that 
complements genomic analyses to provide a more complete picture of the workings of a cell
(Zhu et al., 2001, Curr. Opin. Chem. Biol. 5:40; Martzen et al., 1999, Science 286:1153;
Zhu et al., 2000, Nat. Genet. 26:283; MacBeath, 2000, Science 289:1760; Caveman, 2000,
J. Cell Sci. 113:3543).

Several groups have recently described microarray formats for the screening of 
protein activities (Zhu et al., 2000, Nat. Genet. 26:283; MacBeath et al., 2000, Science
289:1763; Arenkov et al., 2000, Anal. Biochem. 278:123). In addition, a collection of
overexpression clones of yeast proteins has been prepared and screened for biochemical
activities (Martzen et al., 1999, Science 286:1153). However, thousands of individual
proteins approximating an entire proteome have not been prepared, arrayed, and screened
for multiple activities (Caveman, 2000, J. Cell Sci. 113:3543).

Screening an entire proteome would entail the systematic probing of biochemical
activities of proteins that are produced in a high-throughput fashion, and analyzing the
functions of hundreds or thousands of protein samples in parallel (Zhu et al., 2000, Nat.
Genet. 26:283; MacBeath et al., 2000, Science 289:1763; Arenkov et al., 2000, Anal.
Biochem. 278:123). Attempts to screen an entire proteome array have encountered major
obstacles, including the inability to generate the necessary expression clones, and to express
and purify the expressed proteins in a high-throughput fashion. In vitro assays have
previously been conducted using random expression libraries or pooling strategies, both of
which have shortcomings (Martzen et al., 1999, Science 286:1153; Bussow et al., 2000,
Genomics 65:1). Specifically, random expression libraries are tedious to screen, and
contain clones that are often not full-length. Another recent approach has been to generate
defined arrays and screen the array using a pooling strategy (Martzen et al., 1999, Science
286:1153). The pooling strategy obscures the actual number of proteins screened, however,
and the strategy is cumbersome when large numbers of positives are identified.

Another method useful for detecting protein-protein interactions is the two-hybrid
approach (Uetz et al., 2000, Nature 403:623; Ito et al., 2000, Proc. Natl. Acad. Sci.
U.S.A. 97:1143). The types of interactions that can be detected using this approach are
limited, however, because the interactions are typically detected in the nucleus.

Therefore, there remains a need in the art for the large-scale analysis of biochemical
functions, which would require preparing and screening, in a high-throughput manner, a
comprehensive set of proteins encoded by a species' genome.

Citation or identification of any reference in Section 2, or in any other section of this
application, shall not be considered an admission that such reference is available as prior art 
to the present invention. 

3. SUMMARY OF THE INVENTION

The present invention provides proteome chips useful for the global study of the 
protein interactions of a species in a high-throughput manner. The methods and 
compositions of the invention are made possible by Applicants' new and unobvious
discovery of a means of preparing a comprehensive set of expression constructs containing
protein-coding sequences of a genome, producing the protein products in host cells in a 
high-throughput fashion, and analyzing the functions of a plurality of proteins in a 
high-throughput manner using microarrays. 

The present invention is directed to proteome chips, which are positionally
addressable arrays comprising a plurality of proteins, with each protein being at a different
position on a solid support, wherein the plurality of proteins represents a substantial 
proportion of all proteins expressed in a single species, wherein translation products of one 
open reading frame are considered a single protein. 
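
As an illustration only, a positionally addressable array can be modeled in software as a mapping from a position on the solid support to the protein immobilized there, so that a signal at a known position identifies the protein. The sketch below is not part of the described invention: the class name `ProteomeChip`, its method names, and the (row, column) coordinate scheme are assumptions made for this example. The ORF names are borrowed from the figure descriptions purely as sample identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class ProteomeChip:
    """Toy model of a positionally addressable array (illustrative only)."""
    # Maps a (row, column) position on the solid support to a protein identifier.
    spots: dict = field(default_factory=dict)

    def attach(self, position, protein_id):
        # Each position holds at most one protein, so the position identifies it.
        if position in self.spots:
            raise ValueError(f"position {position} is already occupied")
        self.spots[position] = protein_id

    def protein_at(self, position):
        # Look up which protein sits at the position where a signal was seen.
        return self.spots[position]

    def positions_of(self, protein_id):
        # A protein may be spotted more than once (for example, in duplicate).
        return sorted(p for p, pid in self.spots.items() if pid == protein_id)


chip = ProteomeChip()
chip.attach((0, 0), "YFL003C")
chip.attach((0, 1), "YFL003C")  # duplicate spot of the same sample
chip.attach((1, 0), "YJR073C")
```

With this hypothetical model, `chip.protein_at((1, 0))` returns the identifier stored at that position, and `chip.positions_of("YFL003C")` lists both duplicate spots.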

An advantage of using arrays, rather than performing one-by-one assays, is the
ability to identify and characterize many protein-probe interactions simultaneously.
Moreover, complex mixtures of probes can be contacted with a proteome chip to, for 
example, detect interactions in a milieu more representative of that in a cell, and to quickly 
evaluate many potential binding compounds. 

Accordingly, in one embodiment, the present invention provides a positionally
addressable array comprising a plurality of proteins, with each protein being at a different
position on a solid support, wherein the plurality of proteins comprises at least 1%, 2%, 3%, 
4%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% of all proteins 
expressed in a single species. 

In another embodiment, the invention provides a positionally addressable array
comprising a plurality of proteins, with each protein being at a different position on a solid
support, wherein the plurality of proteins comprises at least 1%, 2%, 3%, 4%, 5%, 10%,
20%, 30%, 40% or 50% of all proteins expressed in a single species, wherein protein
isoforms and splice variants are counted as a single protein. In a specific embodiment, the
plurality of proteins comprises at least 50% of all proteins expressed in a single species,
wherein protein isoforms and splice variants are counted as a single protein.
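
One way to read the percentage thresholds above is as a coverage fraction in which all isoforms and splice variants of a gene are collapsed into a single entry before counting. The following is a minimal sketch under that reading. The function name `proteome_coverage`, the `gene_of` lookup table, and the toy identifiers are hypothetical and introduced only for illustration.

```python
def proteome_coverage(arrayed_products, gene_of, expressed_genes):
    """Fraction of expressed genes represented on the array.

    arrayed_products: iterable of product identifiers present on the array
    gene_of: dict mapping each product (isoform, splice variant) to its gene
    expressed_genes: set of all genes expressed in the species
    """
    # Collapse isoforms and splice variants to their gene so each gene counts once.
    genes_on_array = {gene_of[p] for p in arrayed_products}
    # Only genes that belong to the reference set contribute to coverage.
    covered = genes_on_array & set(expressed_genes)
    return len(covered) / len(expressed_genes)


# Toy data: two variants of geneA and one product of geneB are on the array.
gene_of = {"geneA.v1": "geneA", "geneA.v2": "geneA", "geneB.v1": "geneB"}
expressed_genes = {"geneA", "geneB", "geneC", "geneD"}
coverage = proteome_coverage(gene_of.keys(), gene_of, expressed_genes)
```

In this toy case the two variants of `geneA` count once, so two of the four expressed genes are covered and `coverage` is 0.5, which would satisfy the "at least 50%" embodiment.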

In another embodiment, the present invention provides a positionally addressable
array comprising a plurality of proteins, with each protein being at a different position on a
solid support, wherein the plurality of proteins comprises at least 1, 2, 3, 4, 5, 10, 20, 30, 40,
50, 100, 200, 500, 1000, 1500, 2000, 2500, 3000, 4000, 5000, 6000, 7000, 8000, 9000,
10,000, 100,000, 500,000 or 1,000,000 protein(s) expressed in a single species.

In another embodiment, the invention provides a positionally addressable array 
comprising a plurality of proteins, with each protein being at a different position on a solid 
support, wherein the plurality of proteins in aggregate comprise proteins encoded by at least 
1000 different known genes in a single species. 

In a further embodiment, the proteins are organized on the array according to a
classification of proteins. The classification can be by abundance, function, functional
class, enzymatic activity, homology, protein family, association with a particular metabolic 
or signal transduction pathway, association with a related metabolic or signal transduction 
pathway, or posttranslational modification. 

In another embodiment, the invention provides a positionally addressable array as
described above, wherein the solid support comprises glass, ceramics, nitrocellulose,
amorphous silicon carbide, castable oxides, polyimides, polymethylmethacrylates, 
polystyrenes, or silicone elastomers. 

In a further embodiment, the solid support comprises a material that helps bind the
plurality of proteins to the solid support. For example, the solid support can be coated with
a material that binds to an affinity tag of each protein. In a particular embodiment, the solid
support comprises glutathione. In another particular embodiment, the solid support coating
comprises nickel or nitrocellulose. In another particular embodiment, the solid support
coating comprises glutathione and nickel. In one embodiment, the solid support is a
nickel-coated glass slide. In a preferred embodiment, the solid support is a
nitrocellulose-coated glass slide. Nitrocellulose-coated glass slides for making protein (and DNA)
microarrays are commercially available (e.g., from Schleicher & Schuell (Keene, NH),
which sells glass slides coated with a nitrocellulose-based polymer (Cat. no. 10 484 182)).
In a specific embodiment, each protein is spotted onto the nitrocellulose-coated glass slide
using an OMNIGRID™ (GeneMachines, San Carlos, CA).

Proteins on the proteome chips preferably are fusion proteins comprising at least one 
affinity tag useful for purifying and/or attaching the proteins to the proteome chip. 

The present invention also provides methods for making proteome chips. 
Accordingly, the invention provides a method for constructing a positionally addressable
array comprising the step of attaching a plurality of proteins to a surface of a solid support,
with each protein being at a different position on the solid support, wherein the plurality of
proteins comprises at least 1%, 2%, 3%, 4%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 
80%, 90%, 95%, or 99% of all proteins expressed in a single species. 

In one embodiment, the invention provides a method for making a positionally
addressable array comprising the step of attaching a plurality of proteins to a surface of a
solid support, with each protein being at a different position on the solid support, wherein 
the plurality of proteins comprises at least 1%, 2%, 3%, 4%, 5%, 10%, 20%, 30%, 40%, or 
50% of all proteins expressed in a single species, wherein protein isoforms and splice 
variants are counted as a single protein. 

In another embodiment, the present invention provides a method for constructing a
positionally addressable array comprising the step of attaching a plurality of proteins to a
surface of a solid support, with each protein being at a different position on the solid
support, wherein the plurality of proteins comprises at least 1, 2, 3, 4, 5, 10, 20, 30, 40, 50,
100, 200, 500, 1000, 1500, 2000, 2500, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10,000,
100,000, 500,000 or 1,000,000 protein(s) expressed in a single species.

In another embodiment, the present invention provides a method for making a 
positionally addressable array comprising the step of attaching a plurality of proteins to a 
surface of a solid support, with each protein being at a different position on the solid 
support, wherein the plurality of proteins in aggregate comprise proteins encoded by at least
1000 different known genes in a single species.

The present invention also provides methods for making and isolating viral, 
prokaryotic or eukaryotic proteins in a readily scalable format, amenable to high-throughput 
analysis. Preferred methods include synthesizing and purifying proteins in an array format 
compatible with automation technologies. Accordingly, in one embodiment, the invention
provides a method for making and isolating viral, prokaryotic or eukaryotic proteins
comprising the steps of growing a eukaryotic cell transformed with a vector having a
heterologous sequence operatively linked to a regulatory sequence, contacting the regulatory
sequence with an inducer that enhances expression of a protein encoded by the heterologous
sequence, lysing the cell, contacting the protein with a binding agent such that a complex
between the protein and binding agent is formed, isolating the complex from cellular debris,
and isolating the protein from the complex, wherein each step is conducted in a 96-well 
format. 

The protein is preferably a fusion protein such that the heterologous sequence 
comprises the coding region for the protein of interest and sequences encoding a tag, such as 
an affinity tag. Such tags can be useful for monitoring the protein, separating the fusion
protein from cellular debris and contaminating reagents, and/or attaching the protein to a
proteome chip of the invention. 

The present invention also provides methods for making a positionally addressable
array comprising the step of attaching a plurality of fusion proteins to a surface of a solid
support, with each protein being at a different position on the solid support, wherein the
protein comprises a first tag, a second tag, and a protein encoded by a genomic nucleic acid
of an organism. In certain embodiments, the protein is tagged with one tag at the
amino-terminal end of the protein, and with a second, different tag at the carboxy-terminal end. In
other embodiments, the protein is tagged with two tags at the amino-terminal end of the
protein, or with two tags at the carboxy-terminal end. In yet other embodiments, the protein
is tagged with one or more tags at site(s) on the protein other than the amino- or
carboxy-terminal end. The advantages of using double-tagged proteins include the ability to obtain
highly purified proteins, as well as providing a streamlined manner of purifying proteins
from cellular debris and attaching the proteins to a solid support.

Accordingly, in a particular embodiment, the first tag is a glutathione-S-transferase
tag ("GST tag") and the second tag is a poly-histidine tag ("His tag"). In another
embodiment, the GST tag and the His tag are attached to the amino-terminal end of the
protein. Alternatively, the GST tag and the His tag are attached to the carboxy-terminal end
of the protein. The GST tag and His tag can be found on either the amino-terminal or
carboxy-terminal end of the protein. In certain embodiments, the GST tag is attached to the
amino-terminal end of the protein and the His tag is attached to the carboxy-terminal end. 
In other embodiments, the His tag is attached to the amino-terminal end of the protein and 
the GST tag is attached to the carboxy-terminal end. 

Alternating the placement of an affinity tag of a fusion protein can lead to functional
secondary structure, proper folding of extracellular domains, and appropriate trafficking,
localization, and/or secretion of proteins. For example, fusion of a GST tag and a His tag 
onto the carboxy-terminal end of the protein can obviate inappropriate folding or expression 
when the regions upstream of the translational initiation codon are blocked. 

The present invention also provides methods of using a protein array to screen for
lipid-binding proteins. Accordingly, in one embodiment, the invention is a method for
using a positionally addressable array comprising a plurality of proteins, with each protein
being at a different position on a solid support, comprising the steps of contacting a probe
with the array, and detecting protein-probe interaction, wherein the probe comprises a lipid.
In a particular embodiment, the lipid comprises a phospholipid such as, but not limited to,
phosphatidylcholine and phosphatidylinositol. In another particular embodiment, the probe
is in the form of a liposome containing phospholipids of interest. 

Also provided are methods for using proteome chips. The proteome chips of the
invention can be used to assay for essentially all protein-protein interactions in a species or
cell. The proteome chips can also be used to systematically assay all the proteins of a
species that interact with a test compound. Therefore, using the proteome chips of the
invention, a multitude of activities can be assayed to yield a wealth of information such as,
but not limited to, defining a "fingerprint" or "signature" of a cell or organism in response to
a stimulus, characterizing all proteins in a species that interact with a probe of interest,
characterizing all proteins in two or more species that interact with a probe of interest,
characterizing all proteins involved in a biological pathway (e.g., metabolic or signal
transduction pathway) or in related biological pathways, characterizing all proteins in a
species with enzymatic activity(ies) of interest (e.g., kinase activity, protease activity,
phosphatase activity, glycosidase activity, acetylase activity, and other chemical group transferring
enzymatic activity), characterizing all proteins in a species with posttranslational
modification(s) of interest, and identifying drug targets. In a specific embodiment,
proteome chips of the invention are used to characterize all proteins, e.g., drug targets, in a
species that interact with a drug or drug candidate of interest.

Thus, the invention encompasses a method for detecting a binding protein
comprising the steps of contacting a probe with a positionally addressable array comprising
a plurality of proteins, with each protein being at a different position on a solid support, 
wherein the plurality of proteins comprises at least one protein encoded by at least 1%, 2%, 
3%, 4%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% of the 
known genes in a single species, and detecting any protein-probe interaction. 

In another embodiment, the invention encompasses a method for detecting a
binding protein comprising the steps of contacting a probe with a positionally addressable
array comprising a plurality of proteins, with each protein being at a different position on a
solid support, wherein the plurality of proteins comprises at least 1%, 2%, 3%, 4%, 5%,
10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% of all proteins expressed
in a single species, wherein protein isoforms and splice variants are counted as a single
protein, and detecting any protein-probe interaction. 

In another embodiment, the invention encompasses a method for detecting a binding 
protein comprising the steps of contacting a probe with a positionally addressable array 
comprising a plurality of proteins, with each protein being at a different position on a solid
support, wherein the plurality of proteins comprises at least 1, 2, 3, 4, 5, 10, 20, 30, 40, 50,
100, 200, 500, 1000, 1500, 2000, 2500, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10,000,
100,000, 500,000 or 1,000,000 known proteins expressed in a single species, and detecting 
any protein-probe interaction. 

In another embodiment, the invention encompasses a method for detecting a binding
protein comprising the steps of contacting a probe with a positionally addressable array
comprising a plurality of proteins, with each protein being at a different position on a solid 
support, wherein the plurality of proteins in aggregate comprises at least 1, 2, 3, 4, 5, 10, 20, 
30, 40, 50, 100, 200, 500, 1000, 5000, 10000, 20000, 30000, 40000, or 50000 different 
known genes in a single species, and detecting any protein-probe interaction. 

In another embodiment, the invention encompasses a method for detecting a binding
protein comprising the steps of contacting a probe with a positionally addressable array
comprising a plurality of proteins, with each protein being at a different position on a solid
support, wherein the plurality of proteins in aggregate comprises at least 1, 2, 3, 4, 5, 10, 20,
30, 40, 50, 100, 200, 500, 1000, 5000, 10000, 20000, 30000, 40000, or 50000 different
known genes in at least two species, and detecting any protein-probe interaction.

In another embodiment, the invention encompasses a method for detecting a binding 
protein comprising the steps of contacting a probe with a positionally addressable array 
comprising a plurality of fusion proteins, with each protein being at a different position on a 
solid support, wherein the fusion protein comprises a first tag, a second tag, and a protein
sequence encoded by genomic nucleic acid of an organism, and detecting any protein-probe
interaction. As described above, in certain embodiments, the two tags can be His and GST. 

The present invention also provides a method of labeling a protein for use in a 
binding assay, comprising the steps of contacting separate aliquots of the protein with a 
biotin-transferring compound under conditions and for a period of time to produce proteins
that are biotinylated to differing degrees among the different aliquots, and combining
together the different aliquots to produce a sample of differentially biotinylated protein. 

The present invention also provides a method for detecting a binding protein 
comprising the steps of contacting a sample of biotinylated protein produced by the method 
described above with a positionally addressable array comprising a plurality of proteins,
with each protein being at a different position on a solid support, and detecting any
positions on the array at which interaction between a biotinylated
protein and a protein on the array occurs.

The present invention also provides a method for detecting a binding protein 
comprising the steps of contacting a sample of biotinylated proteins produced by the method
described above with a positionally addressable array comprising a plurality of proteins,
with each protein being at a different position on a solid support, contacting the array with
streptavidin conjugated to a fluor and detecting positions on the array at which the
fluorescence occurs, wherein the fluorescence indicates that interaction between a
biotinylated protein and a protein on the array occurs.

WO 02/092118    PCT/US02/14982

The present invention also provides a method for identifying whether a signal is
positive. Accordingly, one embodiment of the invention provides a method for identifying
whether a signal obtained in an assay using a protein microarray is positive, indicating 
binding of a probe to an interactor. 

3.1 Definitions

As used herein, the word "protein" refers to a full-length protein, a portion of a 
protein, or a peptide. Proteins can be produced via fragmentation of larger proteins, or 
chemically synthesized. Preferably, proteins are prepared by recombinant overexpression in 
a species such as, but not limited to, bacteria, yeast, insect cells, and mammalian cells.
Proteins to be placed in a protein microarray of the invention preferably are fusion proteins,
more preferably with at least one affinity tag to aid in purification and/or immobilization. 

As used herein, the word "interactor" refers to a protein on a protein microarray that 
interacts with a probe. 

As used herein, the word "probe" refers to any chemical reagent such as, but not
limited to, a protein, nucleic acid (e.g., DNA, RNA, oligonucleotide, polynucleotide), small
molecule, substrate, inhibitor, drug or drug candidate, receptor, antigen, hormone, steroid,
lipid, phospholipid, liposome, antibody, cofactor, cytokine, glutathione, immunoglobulin
domain, carbohydrate, maltose, nickel, dihydrotrypsin, calmodulin, biotin, lectin, and heavy
metal, that can be applied to a protein microarray of the invention to assay for interaction
with a protein of the microarray. 

4. BRIEF DESCRIPTION OF THE FIGURES 

FIGS. 1A-1B. The procedure of yeast proteome analysis using protein chip
technology.

A. The yeast ORFs were cloned into a double-tagged yeast GAL1 expression vector
via a recombination strategy and verified for correct identities by sequencing. Each pure
plasmid construct was then reintroduced into a yeast strain for large-scale protein
purification. Yeast cultures were grown in a 96-well format, and induced by adding
galactose. After the high-throughput purification step, the purified proteins were aliquoted
and stored in a glycerol buffer at -80°C before printing. Using a high precision
microarrayer, 6566 protein samples can be spotted in duplicate onto 80 slides in a single
experiment.

B. Immunoblot analysis of proteins purified from 3-ml yeast cultures.



FIGS. 2A-2C. A proteome chip probed with anti-GST antibodies.
Sixty samples were examined by immunoblot analysis using anti-GST antibodies;
19 representative examples are shown. Greater than 80% of the preparations produce high
yields of fusion protein.

A. GST::yeast fusion proteins were purified in a 96-well format. 6566 protein
samples representing 5800 unique proteins were spotted in duplicate on a single
nickel-coated microscope slide. The slide was probed with anti-GST antibodies.

B. An enlarged image of one of the 48 blocks is depicted to the right of the
proteome chip. The letters indicate the duplicated protein samples, and the numbers
represent the source plate numbers. Note the excellent resolution and high signal-to-noise
ratios.

C. Distribution of errors between duplicated spots of each ORF product. The unit
of s is the signal intensity determined by the Axon scanner. As determined by a GST
control, ten units are equivalent to approximately 32 fg of protein. The center of the s
distribution is 1.2 units, and 95% of the samples are within 10 units from the center.
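
The duplicate-spot comparison in panel C can be sketched as a short calculation. The sketch below assumes that s denotes the difference in signal intensity between the two spots of a sample and that the "center" is the median of those differences; neither assumption is stated in the text. The function name and the sample intensities are hypothetical, and only the 32 fg per ten units conversion comes from the description.

```python
from statistics import median

FG_PER_UNIT = 32 / 10  # from the GST control: ten units ~ 32 fg of protein


def duplicate_spot_summary(pairs, window=10.0):
    """Summarize differences between duplicate spots.

    pairs: list of (intensity_a, intensity_b), one tuple per spotted sample.
    Returns (center, fraction_within_window, window_in_fg).
    """
    diffs = [a - b for a, b in pairs]
    center = median(diffs)  # assumption: the median serves as the "center"
    # Fraction of samples whose difference lies within `window` units of the center.
    within = sum(abs(d - center) <= window for d in diffs) / len(diffs)
    return center, within, window * FG_PER_UNIT


# Made-up intensities for four samples, not data from the figure.
example_pairs = [(100, 99), (50, 48), (200, 170), (10, 9)]
center, fraction, window_fg = duplicate_spot_summary(example_pairs)
```

Under these assumptions, the reported 10-unit window corresponds to roughly 32 fg of protein, which is the value returned as `window_fg`.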



FIGS. 3A-3B. Examples of different assays on the proteome chips. 

A. Proteome chips containing 6566 yeast proteins were spotted in duplicate and
incubated with the biotinylated probes indicated. Positive interactors, indicated by boxes,
were identified in six protein-liposome interaction assays ("PI(3)P", "PI(4,5)P2", "PI(4)P",
"PI(3,4)P2", "PI(3,4,5)P3", "PC"), a calmodulin-binding assay ("Calmodulin"), and a
DNA-protein interaction assay ("Genomic DNA"). Each block contains 16 x 18 protein
spots. The positive signals in duplicate appear as horizontal pairs in the bottom panels. The
duplicate spotting serves as an internal control, which is important when the signals are
weak relative to the background. The upper panels show the same yeast protein
preparations of a control proteome chip probed with anti-GST antibodies ("α-GST"). As
demonstrated by the figure, strong signals are often observed in samples having relatively
low levels of GST fusion protein ("Probe"), indicating that the binding is sensitive and
specific. PI, phosphatidylinositol. PC, phosphatidylcholine.
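
The layout figures quoted in the descriptions of FIGS. 1 to 3 can be cross-checked with simple arithmetic. The sketch below assumes that the 48 blocks mentioned for FIG. 2B and the 16 x 18 spots per block mentioned for FIG. 3A describe the same slide layout, which the text does not state explicitly. The variable names are chosen for illustration.

```python
import math

SAMPLES = 6566          # protein samples printed per slide (from the text)
REPLICATES = 2          # each sample is spotted in duplicate
BLOCKS = 48             # blocks per slide (FIG. 2B)
ROWS, COLS = 16, 18     # spots per block (FIG. 3A)
WELLS_PER_PLATE = 96    # purification format

spots_needed = SAMPLES * REPLICATES                    # duplicate spots to print
spots_available = BLOCKS * ROWS * COLS                 # capacity of the block grid
spare_positions = spots_available - spots_needed       # positions left unused
source_plates = math.ceil(SAMPLES / WELLS_PER_PLATE)   # 96-well plates required
```

Under the stated assumption, 13,132 spots are needed and the grid offers 13,824 positions, so the duplicate printing fits on one slide with 692 positions to spare, and the samples occupy 69 source plates.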

B. A putative calmodulin-binding motif identified by searching for amino acid
sequences that are shared by the different calmodulin targets (Zhu et al., 2000, Nat. Genet.
26:283). 14 of 39 positive proteins share a motif whose consensus is I/L-Q-X-K-K/X-G-B
(SEQ ID NO: 1), where X is any residue and B is a basic residue. The size of the lettering,
depicted above the alignment, indicates the relative frequency of the amino acid indicated.
YFL003C/MSH4 (SEQ ID NO: 2); YJR073C/OPI3 (SEQ ID NO: 3); YBR050C/REG2
(SEQ ID NO: 4); YNL202W/SPS19 (SEQ ID NO: 5); YOL016C/CMK2 (SEQ ID NO: 6);
YBR011C/IPP1 (SEQ ID NO: 7); YGR034W/RPL26B (SEQ ID NO: 8); YFR004W/RPN11
(SEQ ID NO: 9); YIL021W/RPB3 (SEQ ID NO: 10); YGL063W/PUS2 (SEQ ID NO: 11);
YDR292C/SRP101 (SEQ ID NO: 12); YFR014C/CMK1 (SEQ ID NO: 13);
YBR213W/MET8 (SEQ ID NO: 14); YAL029C/MYO4 (SEQ ID NO: 15).
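
A consensus of this kind can be scanned for with a regular expression. The sketch below is one possible encoding, not the search procedure used by Applicants. It assumes that the "K/X" position accepts any residue and that the basic residue B means lysine, arginine, or histidine. The function name `find_motif` and the test sequence are invented for illustration.

```python
import re

# Consensus I/L-Q-X-K-K/X-G-B; position 5 (K/X) is treated as any residue.
# Assumption: the basic residue B is taken to be K, R, or H.
MOTIF = re.compile(r"[IL]Q[A-Z]K[A-Z]G[KRH]")


def find_motif(sequence):
    """Return (start_index, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(sequence.upper())]


# Invented sequence containing one match; not a real protein sequence.
hits = find_motif("MAAIQSKKGRTT")
```

Because position 5 accepts any residue, this pattern cannot express the preference for lysine that the lettering size in the figure conveys; a position weight matrix would be needed for that.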

FIGS. 4A-4D. Analysis of the phosphatidylinositol-binding proteins. To determine
the phosphatidylinositol ("PI")-binding specificity of 150 positive proteins, their binding
signals were normalized against the corresponding binding signals of phosphatidylcholine
("PC"). Based on the ratios ("PI/PC"), the proteins were grouped into four categories: (A)
30 strong and specific, (B) 43 strong and nonspecific, (C) 19 weak and specific, and (D) 58
weak and nonspecific phosphatidylinositol-binding proteins. The intensity represents the
PI/PC signal ratio as shown by the scale in the figure. The column labeled "10^n" to the
right of the PI/PC binding ratios indicates the maximum binding signal intensity
(open boxes) and its confidence interval (solid horizontal lines); the numbers indicate the
log of the values. Boxes in the three columns to the right of the confidence interval column
indicate membrane-associated proteins ("Membrane"), kinases ("Kinase"), and 
uncharacterized ORFs ("Unknown"), respectively. 
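
The four-way grouping can be illustrated with a small classifier. This is a sketch only: the text does not state how "strong" and "specific" were decided, so the sketch assumes that strength is judged from the PI signal and specificity from the PI/PC ratio, and both cutoffs (`strong_cutoff`, `specific_ratio`) are hypothetical parameters.

```python
def classify_binder(pi_signal, pc_signal, strong_cutoff, specific_ratio):
    """Assign a lipid-binding protein to one of four categories.

    strong_cutoff and specific_ratio are hypothetical thresholds; the text
    does not state the values actually used.
    """
    ratio = pi_signal / pc_signal  # PI signal normalized against PC
    strength = "strong" if pi_signal >= strong_cutoff else "weak"
    specificity = "specific" if ratio >= specific_ratio else "nonspecific"
    return f"{strength} and {specificity}"


# The four category sizes reported for FIGS. 4A-4D add up to the 150 positives.
category_sizes = {"A": 30, "B": 43, "C": 19, "D": 58}
total = sum(category_sizes.values())
```

For example, with an arbitrary `strong_cutoff` of 500 and `specific_ratio` of 3, a protein with a PI signal of 1000 and a PC signal of 100 would fall into the "strong and specific" group.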


FIGS. 5A-5C. Conventional methods confirm protein-lipid interactions detected by 
the proteome microarrays (Casamayor et al., 1999, Curr. Biol. 9:186; Guerra et al., 2000, 
Biosci. Rep. 20:41). 

A. PI(4,5)P2 liposomes were first adhered to a nitrocellulose membrane, which was
blocked by BSA; a dilution series of Rim15p, Eno2p, and Hxk1p, and a GST control were
used to probe the membrane. The bound proteins were detected using the anti-GST
antibodies and an enhanced chemiluminescence ("ECL") kit.

B. A reverse assay was carried out to test for protein-lipid interactions. The
proteins were prepared and spotted onto nitrocellulose filters in a dilution series and probed
with the six different liposomes. As a control, the six liposomes were also added to the
BSA-blocked membrane. After extensive washing, the bound liposomes were detected
using an HRP-conjugated streptavidin and an ECL kit.

C. Linear correlation between the binding signals and the amounts of Rim15p in a
membrane assay. When liposome-binding signals of Rim15p from the membrane assay
(FIG. 4B) were plotted against the concentration gradient of the spotted Rim15p, the
binding signals of PI(4)P, PI(3,4)P2, PI(3)P, and PI(4,5)P2 correlated linearly with the
amounts of Rim15p. The interaction of PI(4)P and Rim15p showed the highest affinity,
which was at least three-fold higher than the affinity of the control PC with Rim15p.
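
A linear relationship between binding signal and spotted protein amount can be checked with an ordinary least-squares fit. The sketch below is a generic illustration, not the analysis used for the figure; the function name `linear_fit` is hypothetical and the dilution series is made up so that the result is easy to verify by hand.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5  # correlation coefficient; near 1 means linear
    return slope, intercept, r


# Made-up dilution series in arbitrary units, not data from the figure.
amounts = [1.0, 2.0, 4.0, 8.0]
signals = [3.0, 5.0, 9.0, 17.0]  # constructed as 2 * amount + 1
slope, intercept, r = linear_fit(amounts, signals)
```

Fitting each liposome separately in this way would yield one slope per lipid, and the ratio of two slopes is one plausible way to express a fold difference such as the three-fold figure quoted for PI(4)P relative to PC.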

5. DETAILED DESCRIPTION OF THE INVENTION

The present invention is based, in part, on Applicants' construction of a yeast
proteome microarray containing approximately 80% of all proteins expressed in yeast,
representing the first description of a proteome array for any species. The proteome chips
of the invention can be used for global analyses of protein interactions and activities in a
species. The use of proteome chips has significant advantages over existing approaches.
One advantage of the proteome microarray technology presented here is that a large set of
individual proteins can be directly screened in a high-throughput fashion for hundreds or
even thousands of biochemical activities simultaneously. For example, an advantage of the
proteome chip approach is that proteins can be directly screened in vitro for a wide variety
of activities including, but not limited to, protein-drug interactions (e.g., drug target-drug
interactions), protein-lipid interactions and enzymatic assays. In addition, a wide range of
in vitro conditions can be readily tested. Furthermore, once the proteins are prepared,
proteome screening is significantly faster and cheaper, and rapid data analysis in many
microarray formats is compatible with existing equipment and analytical software.

5.1 Proteome Arrays. 

The present invention encompasses a positionally addressable array comprising a
plurality of proteins, with each protein being at a different position on a solid support,
wherein the plurality of proteins comprises at least one protein encoded by at least 1%, 2%,
3%, 4%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% of the
known genes in a single species, i.e., all protein isoforms and splice variants derived from a
gene are considered one protein.
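
The counting rule above (all isoforms and splice variants of a gene count as one protein) can be illustrated with a minimal Python sketch. The function name, the gene labels, and the known-gene count are hypothetical and chosen only for illustration; they are not taken from the specification.

```python
def gene_coverage(arrayed_proteins, known_gene_count):
    """Return the fraction of known genes represented on an array.

    arrayed_proteins: iterable of (gene, variant) pairs, one per arrayed protein.
    Isoforms and splice variants of the same gene are counted once.
    """
    if known_gene_count <= 0:
        raise ValueError("known_gene_count must be positive")
    represented_genes = {gene for gene, _variant in arrayed_proteins}
    return len(represented_genes) / known_gene_count


# Hypothetical array: two variants of GENE_A plus one protein from GENE_B,
# in a species assumed to have four known genes.
example_array = [("GENE_A", "isoform 1"), ("GENE_A", "isoform 2"), ("GENE_B", "full length")]
coverage = gene_coverage(example_array, known_gene_count=4)
```

Under these assumed inputs the array represents two of four known genes, so it would meet the "at least 50%" threshold but not the "at least 60%" threshold.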

The present invention encompasses a positionally addressable array comprising a
plurality of proteins, with each protein being at a different position on a solid support,
wherein the plurality of proteins comprises at least 1%, 2%, 3%, 4%, 5%, 10%, 20%, 30%,
40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% of all proteins expressed in a single species,
where protein isoforms and splice variants are counted as a single protein. In one
embodiment, the plurality of proteins comprises about 90%, 95%, or 99% of all proteins
expressed in a species. In a particular embodiment, the plurality of proteins comprises
about 93.5% of all proteins expressed in a species.

The present invention also encompasses a positionally addressable array comprising
a plurality of proteins, with each protein being at a different position on a solid support,
wherein the plurality of proteins comprises at least 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 100, 200,
500, 1000, 1500, 2000, 2500, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10,000, 100,000,
500,000 or 1,000,000 protein(s) expressed in a single species.

The present invention also encompasses a positionally addressable array comprising
a plurality of proteins, with each protein being at a different position on a solid support,
wherein the plurality of proteins in aggregate comprises at least 1, 2, 3, 4, 5, 10, 20, 30, 40,
50, 100, 200, 500, 1000, 5000, 10000, 20000, 30000, 40000, or 50000 different known
genes in a single species.

The present invention also encompasses a positionally addressable array comprising
a plurality of fusion proteins attached to a surface of a solid support, with each fusion
protein being at a different position on the solid support, wherein the fusion protein
comprises a first tag, a second tag, and a protein sequence encoded by genomic nucleic acid
of an organism. In another embodiment, the protein sequence of the fusion protein need not
be encoded in a genomic nucleic acid of an organism, but is a sequence for which it is
desired to identify a function and/or activity of a binding protein.

A positionally addressable array provides a configuration such that each probe or
protein of interest is at a known position on the solid support thereby allowing the identity
of each probe or protein to be determined from its position on the array. Accordingly, each
protein on an array is preferably located at a known, predetermined position on the solid
support such that the identity of each protein can be determined from its position on the
solid support.
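
As an illustration of positional addressing, the following Python sketch keeps a lookup from array position to protein identity. It is only one possible bookkeeping scheme; the `Spot` class, the row/column coordinates, and the three-spot layout are hypothetical and not part of the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spot:
    """One site on the array: a grid position and the protein applied there."""
    row: int
    column: int
    protein: str


def build_position_index(spots):
    """Map each (row, column) position to the identity of the protein at it.

    Raises ValueError if a position is assigned twice, because identity can
    only be read from position when each position holds one known sample.
    """
    index = {}
    for spot in spots:
        key = (spot.row, spot.column)
        if key in index:
            raise ValueError(f"position {key} is assigned twice")
        index[key] = spot.protein
    return index


# Hypothetical layout using protein names that appear in the figure legends.
layout = [Spot(0, 0, "Rim15p"), Spot(0, 1, "Eno2p"), Spot(1, 0, "Hxk1p")]
position_index = build_position_index(layout)
```

With such an index, a signal detected at a given position is resolved to a protein by a single lookup, e.g. `position_index[(0, 1)]`.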

In one embodiment, the species is a virus. In another embodiment, the species is a
prokaryote. In another embodiment, the species is a eukaryote. In another embodiment, the
species is a vertebrate. In yet another embodiment, the species is a mammal. In a particular
embodiment, the species is an animal, including, but not limited to, an insect, primate, and
rodent. In a specific embodiment, the species is a monkey, fruit fly, cow, horse, sheep, pig,
chicken, turkey, quail, cat, dog, mouse, rat, rabbit, nematode or fish. In a preferred
embodiment, the species is a human. In another preferred embodiment, the species is a
yeast.

Proteins of the proteome chips of the invention include full-length proteins, portions
of full-length proteins, and peptides, which can be prepared by recombinant overexpression,
fragmentation of larger proteins, or chemical synthesis. Proteins can be overexpressed in
cells derived from, for example, yeast, bacteria, insects, humans, or non-human mammals
such as mice, rats, cats, dogs, pigs, cows and horses. Further, fusion proteins comprising a
defined domain attached to a natural or synthetic protein can be used. Proteins of the
proteome chips can be purified prior to being attached to the solid support of the chip. Also,
the proteins of the proteome chips can be purified, or further purified, during attachment to
the proteome chip.

Proteins can be embedded in artificial or natural membranes (e.g., liposomes,
membrane vesicles) prior to, or at the time of attachment to the protein chip. In fact, the
synthesis of certain proteins may preferably be conducted in the presence of artificial or
natural membranes to, for example, promote protein folding, protein processing, retain
activity, and/or prevent precipitation of the protein.

Further, proteins can be attached to the solid support of the proteome chip. 
Alternatively, the proteins can be delivered into wells of the proteome chip, where they 
remain unbound to the solid support of the proteome chip. 

The present invention is also directed to compounds useful as solid supports for the
proteome chips of the invention. The solid support can be constructed from materials such
as, but not limited to, silicon, glass, quartz, polyimide, acrylic, polymethylmethacrylate
(LUCITE®), ceramic, nitrocellulose, amorphous silicon carbide, polystyrene, and/or any
other material suitable for microfabrication, microlithography, or casting. For example, the
solid support can be a hydrophilic microtiter plate (e.g., MILLIPORE™) or a
nitrocellulose-coated glass slide. In a preferred embodiment, the solid support is a
nitrocellulose-coated glass slide. Nitrocellulose-coated glass slides for making protein (and
DNA) microarrays are commercially available (e.g., from Schleicher & Schuell (Keene, NH),
which sells glass slides coated with a nitrocellulose based polymer (Cat. no. 10 484 182)).
In a specific embodiment, each protein is spotted onto the nitrocellulose-coated glass slide
using an OMNIGRID™ (GeneMachines, San Carlos, CA). The present invention contemplates
other solid supports useful for constructing a protein chip, some of which are disclosed, for
example, in co-pending United States Application No. 09/849,781, which was filed on
May 4, 2001, and which is incorporated herein by reference in its entirety.

In a particular embodiment, the solid support comprises a silicone elastomeric
material such as, but not limited to, polydimethylsiloxane ("PDMS"). An advantage of
silicone elastomeric materials is their flexible nature.

In another particular embodiment, the solid support is a silicon wafer. The silicon
wafer can be patterned and etched (see, e.g., G. Kovacs, 1998, Micromachined Transducers
Sourcebook, Academic Press; M. Madou, 1997, Fundamentals of Microfabrication, CRC
Press). The etched wafer can be used to cast the proteome chips of the invention.

In one embodiment, the present invention provides a proteome chip comprising a
solid support that is a flat surface such as, but not limited to, a glass slide. Dense protein
arrays can be produced on, for example, glass slides, such that assays for the presence,
amount, and/or functionality of proteins can be conducted in a high-throughput manner.

Accordingly, in one embodiment, the proteome chip comprises a plurality of
proteins that are applied to the surface of a solid support, wherein the density of the sites at
which proteins are applied is at least 100 sites/cm², 1000 sites/cm², 10,000 sites/cm², 100,000
sites/cm², 1,000,000 sites/cm², 10,000,000 sites/cm², 25,000,000 sites/cm², 10,000,000,000
sites/cm², or 10,000,000,000,000 sites/cm². Each individual protein sample is preferably
applied to a separate site on the chip. The identity of the protein(s) at each site on the chip
is/are known.

In another embodiment, the solid support has an array of wells. Microlithographic
and micromachining fabrication techniques (see, e.g., co-pending United
States Application No. 09/849,781, filed on May 4, 2001, which is incorporated herein by
reference in its entirety) can be used to create well arrays with a wide variety of dimensions
ranging from hundreds of microns down to 100 nm or even smaller, with well depths of
similar dimensions. In one embodiment, a silicon wafer is micromachined and acts as a
master mold to cast wells of 400 µm diameter that are spaced 200 µm apart, for a well
density of about 277 wells per cm², with individual well volumes of about 30 nl for 100 µm
deep wells.

In another embodiment, microlithographic micromachining is used to fabricate wells
of 500 nm and 275 nm diameter, spaced 1 µm apart, to yield well densities of over 44 million
and over 61 million wells per cm², respectively. Higher densities are possible through closer
spacing, as well as through smaller diameters.
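
The well densities quoted above can be checked with a short Python sketch. It assumes a square grid in which the centre-to-centre pitch equals the well diameter plus the edge-to-edge spacing; that geometric assumption is not stated in the text, but it is consistent with the quoted figures. The function name is illustrative.

```python
def wells_per_cm2(diameter_um, spacing_um):
    """Wells per square centimetre for a square grid of wells.

    Assumes the centre-to-centre pitch is diameter + spacing, so each well
    occupies one pitch-by-pitch square. All lengths are in micrometres.
    """
    pitch_um = diameter_um + spacing_um
    wells_per_cm = 10_000 / pitch_um  # 1 cm = 10,000 micrometres
    return wells_per_cm ** 2


cast_wells = wells_per_cm2(400, 200)       # 400 um wells spaced 200 um apart
fine_wells = wells_per_cm2(0.5, 1.0)       # 500 nm wells spaced 1 um apart
finer_wells = wells_per_cm2(0.275, 1.0)    # 275 nm wells spaced 1 um apart
```

Under this assumption the three cases evaluate to roughly 278, 44.4 million, and 61.5 million wells per cm², in line with the values given in the two preceding paragraphs.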

In another embodiment, precision laser micromachining techniques can be used to
directly fabricate mold structures out of acrylic with dimensions ranging from greater than
1.5 mm down to 500 µm, with well spacing of about 500 µm. Volumes of these wells are
in the 50-500 nl range.

Accordingly, in one embodiment, the proteome chip comprises a plurality of wells
on the surface of a solid support, wherein the density of wells is at least 100 wells/cm², 1000
wells/cm², 10,000 wells/cm², 100,000 wells/cm², 1,000,000 wells/cm², 10,000,000
wells/cm², 25,000,000 wells/cm², 10,000,000,000 wells/cm², or 10,000,000,000,000
wells/cm². The present invention contemplates variations of protein chips comprising a
plurality of wells, which are disclosed, for example, in co-pending United States
Application No. 09/849,781, filed on May 4, 2001, and which is incorporated herein by
reference in its entirety.

The present invention also contemplates variations in the shape, width-to-depth
ratio and volume of wells in the proteome chip, which are disclosed, for example, in
co-pending United States Application No. 09/849,781, filed on May 4, 2001, and which is
incorporated herein by reference in its entirety. Such shapes include, but are not limited to,
circular, oval, rectangular, square, etc. The wells can also have, for example, square, round,
V-shaped or U-shaped bottoms.

In one embodiment, the solid support comprises gold. In a preferred embodiment,
the solid support comprises a gold-coated slide. In another embodiment, the solid support
comprises nickel. In another preferred embodiment, the solid support comprises a
nickel-coated slide. Solid supports comprising nickel are advantageous for purifying and
attaching fusion proteins having a poly-histidine tag ("His tag"). In another embodiment,
the solid support comprises nitrocellulose. In another preferred embodiment, the solid
support comprises a nitrocellulose-coated slide.

The invention further relates to compounds useful for derivatization of the proteome
chip substrate. The proteins can be bound directly to the solid support, or can be attached to
the solid support through a linker molecule or compound. The linker can be any molecule
or compound that derivatizes the surface of the solid support to facilitate the attachment of
proteins to the surface of the solid support. The linker may covalently or non-covalently
bind the proteins or probes to the surface of the solid support. In addition, the linker can be
an inorganic or organic molecule. In certain embodiments, the linker may be a silane, e.g.,
cyanosilane, thiosilane, aminosilane, etc. The present invention contemplates compounds
useful for derivatization of a protein chip, some of which are disclosed, for example, in
co-pending United States Application No. 09/849,781, which was filed on May 4, 2001, and
which is incorporated herein by reference in its entirety.

Accordingly, in one embodiment, the proteins of the proteome chip are bound
non-covalently to the solid support (e.g., by adsorption). Proteins that are non-covalently
bound to the solid support can be attached to the surface of the solid support by a variety of
molecular interactions such as, for example, hydrogen bonding, van der Waals bonding,
electrostatic, or metal-chelate coordinate bonding. In a particular embodiment, proteins are
bound to a poly-lysine coated surface of the solid support. In addition, as described above,
in certain embodiments, the proteins are bound to a silane (e.g., cyanosilane, thiosilane,
aminosilane, etc.) coated surface of the solid support.

In addition, crosslinking compounds commonly known in the art, e.g., homo- or
heterofunctional crosslinking compounds (e.g., bis[sulfosuccinimidyl]suberate,
N-[gamma-maleimidobutyryloxy]succinimide ester, or
1-ethyl-3-[3-dimethylaminopropyl]carbodiimide), may be used to attach proteins to the
solid support via covalent or non-covalent interactions.

In another embodiment, the proteins of the proteome chip are bound covalently to
the solid support. For example, the proteins can be bound to the solid support by
receptor-ligand interactions, which include interactions between antibodies and antigens,
DNA-binding proteins and DNA, enzyme and substrate, avidin (or streptavidin) and biotin
(or biotinylated molecules), and interactions between lipid-binding proteins and
phospholipids (or membranes, vesicles, or liposomes comprising phospholipids).

Purified proteins can be placed on an array using a variety of methods known in the
art. In one embodiment, the proteins are printed onto the solid support. In a further
embodiment, the proteins are attached to the solid support using an affinity tag. Use of an
affinity tag different from that used to purify the proteins is preferred, since further
purification is achieved when building the protein array.

Accordingly, in a preferred embodiment, proteins of the proteome chip are
expressed as fusion proteins having at least one heterologous domain with an affinity for a
compound that is attached to the surface of the solid support. Suitable compounds useful
for binding fusion proteins onto the solid support (i.e., acting as binding partners) include,
but are not limited to, trypsin/anhydrotrypsin, glutathione, immunoglobulin domains,
maltose, nickel, or biotin and its derivatives, which bind to bovine pancreatic trypsin
inhibitor, glutathione-S-transferase, Protein A or antigen, maltose binding protein,
poly-histidine (e.g., HisX6 tag), and avidin/streptavidin, respectively. For example, Protein
A, Protein G and Protein A/G are proteins capable of binding to the Fc portion of
mammalian immunoglobulin molecules, especially IgG. These proteins can be covalently
coupled to, for example, a Sepharose® support to provide an efficient method of purifying
fusion proteins having a tag comprising an Fc domain.

In a further embodiment, the proteins are bound directly to the solid support. In
another further embodiment, the proteins are bound to the solid support via a linker. In a
particular embodiment, the proteins are attached to the solid support via a His tag. In
another particular embodiment, the proteins are attached to the solid support via a
3-glycidoxypropyltrimethoxysilane ("GPTS") linker. In a specific embodiment, the
proteins are bound to the solid support via His tags, wherein the solid support comprises a
flat surface. In a preferred embodiment, the proteins are bound to the solid support via His
tags, wherein the solid support comprises a nickel-coated glass slide.

The proteome chips of the present invention are not limited in their physical
dimensions and can have any dimensions that are useful. Preferably, the proteome chip has
an array format compatible with automation technologies, thereby allowing for rapid data
analysis. Thus, in one embodiment, the proteome microarray format is compatible with
laboratory equipment and/or analytical software. In a preferred embodiment, the proteome
chip is the size of a standard microscope slide. In another preferred embodiment, the
protein chip is designed to fit into a sample chamber of a mass spectrometer.

5.2 Methods for Making and Purifying Proteins in a High Density Array Format.

The present invention also relates to methods for making and isolating viral,
prokaryotic or eukaryotic proteins in a readily scalable format, amenable to high-throughput
analysis. Preferred methods include synthesizing and purifying proteins in an array format
compatible with automation technologies. Accordingly, in one embodiment, the invention
provides a method for making and isolating eukaryotic proteins comprising the steps of
growing a eukaryotic cell transformed with a vector having a heterologous sequence
operatively linked to a regulatory sequence, contacting the regulatory sequence with an
inducer that enhances expression of a protein encoded by the heterologous sequence, lysing
the cell, contacting the protein with a binding agent such that a complex between the protein
and binding agent is formed, isolating the complex from cellular debris, and isolating the
protein from the complex, wherein each step is conducted in a 96-well format.

In one embodiment, the invention provides a method for making a positionally
addressable array comprising the step of attaching a plurality of proteins to a surface of a
solid support, with each protein being at a different position on the solid support, wherein
the plurality of proteins comprises at least one protein encoded by at least 1%, 2%, 3%, 4%,
5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% of the known genes
in a single species.

In another embodiment, the invention provides a method for making a positionally
addressable array comprising the step of attaching a plurality of proteins to a surface of a
solid support, with each protein being at a different position on the solid support, wherein
the plurality of proteins comprises at least 1%, 2%, 3%, 4%, 5%, 10%, 20%, 30%, 40%,
50%, 60%, 70%, 80%, 90%, 95%, or 99% of all proteins expressed in a single species,
wherein protein isoforms and splice variants are counted as a single protein.

In another embodiment, the invention provides a method for making a positionally
addressable array comprising the step of attaching a plurality of proteins to a surface of a
solid support, with each protein being at a different position on the solid support, wherein
the plurality of proteins comprises at least 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 100, 200, 500,
1000, 1500, 2000, 2500, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10,000, 100,000,
500,000 or 1,000,000 protein(s) expressed in a single species.

In yet another embodiment, the invention provides a method for making a
positionally addressable array comprising the step of attaching a plurality of proteins to a
surface of a solid support, with each protein being at a different position on the solid
support, wherein the plurality of proteins in aggregate comprises at least 1, 2, 3, 4, 5, 10, 20,
30, 40, 50, 100, 200, 500, 1000, 5000, 10000, 20000, 30000, 40000, or 50000 different
known genes in a single species.

In yet another embodiment, the invention provides a method for making a
positionally addressable array comprising the step of attaching a plurality of proteins to a
surface of a solid support, with each protein being at a different position on the solid
support, wherein the protein is a fusion protein comprising a first tag, a second tag, and a
protein encoded by genomic nucleic acid of an organism.

In one embodiment, each step in the synthesis and purification procedures is
conducted in an array amenable to rapid automation. Such arrays can comprise a plurality
of wells on the surface of a solid support wherein the density of wells is at least 10, 20, 30,
40, 50, 100, 1000, 10,000, 100,000, or 1,000,000 wells/cm², for example. Alternatively,
such arrays comprise a plurality of sites on the surface of a solid support, wherein the
density of sites is at least 10, 20, 30, 40, 50, 100, 1000, 10,000, 100,000, or 1,000,000
sites/cm², for example.

In a particular embodiment, eukaryotic proteins are made and purified in a 96-array
format (i.e., each site on the solid support where processing occurs is one of 96 sites), e.g.,
in a 96-well microtiter plate. In a preferred embodiment, the solid support does not bind
proteins (e.g., a non-protein-binding microtiter plate).
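
When samples are tracked through a 96-array format, each sample can be tied to one of the 96 sites. The Python sketch below assumes the conventional 96-well microtiter plate layout of 8 rows (A to H) by 12 columns and a row-major sample order; the ordering and the function names are assumptions made for illustration, not part of the specification.

```python
ROWS = "ABCDEFGH"   # 8 rows on a conventional 96-well plate
COLUMNS = 12        # 12 columns, giving 8 x 12 = 96 wells


def well_name(index):
    """Convert a zero-based, row-major sample index (0-95) to a well label."""
    if not 0 <= index < len(ROWS) * COLUMNS:
        raise ValueError(f"index {index} is outside a 96-well plate")
    row, column = divmod(index, COLUMNS)
    return f"{ROWS[row]}{column + 1}"


def well_index(name):
    """Convert a well label such as 'B1' back to its zero-based index."""
    row = ROWS.index(name[0].upper())  # raises ValueError for an unknown row
    column = int(name[1:])
    if not 1 <= column <= COLUMNS:
        raise ValueError(f"column {column} is outside a 96-well plate")
    return row * COLUMNS + (column - 1)
```

Keeping the same index for a sample across culture, lysis, and purification plates is one simple way to preserve its identity until it is spotted onto the array.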

In certain embodiments, proteins are synthesized by in vitro translation according to 
methods commonly known in the art. 

Any expression construct having an inducible promoter to drive protein synthesis
can be used in accordance with the methods of the invention. Preferably, the expression
construct is tailored to the cell type to be used for transformation. Compatibility between
expression constructs and host cells is known in the art, and use of variants thereof is also
encompassed by the invention.

Any host cell that can be grown in culture can be used to synthesize the proteins of
interest. Preferably, host cells are used that can overproduce a protein of interest, resulting
in proper synthesis, folding, and posttranslational modification of the protein. Preferably,
such protein processing forms epitopes, active sites, binding sites, etc. useful for assays to
characterize molecular interactions in vitro that are representative of those in vivo.

Accordingly, a eukaryotic cell (e.g., yeast, human cells) is preferably used to
synthesize eukaryotic proteins. Further, a eukaryotic cell amenable to stable transformation,
and having selectable markers for identification and isolation of cells containing
transformants of interest, is preferred. Alternatively, a eukaryotic host cell deficient in a
gene product is transformed with an expression construct complementing the deficiency.
Cells useful for expression of engineered viral, prokaryotic or eukaryotic proteins are known
in the art, and variants of such cells can be appreciated by one of ordinary skill in the art.

For example, the InsectSelect system from Invitrogen (Carlsbad, CA, catalog no.
K800-01), a non-lytic, single-vector insect expression system that simplifies expression of
high-quality proteins and eliminates the need to generate and amplify virus stocks, can be
used. A preferred vector in this system is pIB/V5-His TOPO TA vector (catalog no.
K890-20). Polymerase chain reaction ("PCR") products can be cloned directly into this
vector, using the protocols described by the manufacturer, and the proteins can be expressed
with N-terminal histidine tags useful for purifying the expressed protein.

Another eukaryotic expression system in insect cells, the BAC-TO-BAC™ system
(LIFETECH™, Rockville, MD), can also be used. Rather than using homologous
recombination, the BAC-TO-BAC™ system generates recombinant baculovirus by relying
on site-specific transposition in E. coli. Gene expression is driven by the highly active
polyhedrin promoter, and therefore can represent up to 25% of the cellular protein in
infected insect cells.

In a particular embodiment, yeast cultures are used to synthesize eukaryotic fusion
proteins. Fresh cultures are preferably used for efficient induction of protein synthesis,
especially when conducted in small volumes of media. Also, care is preferably taken to
prevent overgrowth of the yeast cultures. In addition, yeast cultures of about 3 ml or less
are preferable to yield sufficient protein for purification. To improve aeration of the
cultures, the total volume can be divided into several smaller volumes (e.g., four 0.75 ml
cultures can be prepared to produce a total volume of 3 ml).

Cells are then contacted with an inducer (e.g., galactose), and harvested. Induced
cells are washed with cold (i.e., 4°C to about 15°C) water to stop further growth of the cells,
and then washed with cold (i.e., 4°C to about 15°C) lysis buffer to remove the culture
medium and to precondition the induced cells for protein purification, respectively. Before
protein purification, the induced cells can be stored frozen to protect the proteins from
degradation. In a specific embodiment, the induced cells are stored in a semi-dried state at
-80°C to prevent or inhibit protein degradation.

Cells can be transferred from one array to another using any suitable mechanical
device. For example, arrays containing growth media can be inoculated with the cells of
interest using an automatic handling system (e.g., automatic pipette). In a particular
embodiment, 96-well arrays containing a growth medium comprising agar can be inoculated
with yeast cells using a 96-pronger. Similarly, transfer of liquids (e.g., reagents) from one
array to another can be accomplished using an automated liquid-handling device (e.g.,
Q-FILL™, Genetix, UK).

Although proteins can be harvested from cells at any point in the cell cycle, cells are
preferably isolated during logarithmic phase when protein synthesis is enhanced. For
example, yeast cells can be harvested between OD600 = 0.3 and OD600 = 1.5, preferably
between OD600 = 0.5 and OD600 = 1.5. In a particular embodiment, proteins are harvested
from the cells at a point after mid-log phase. Harvested cells can be stored frozen for future
manipulation.

The harvested cells can be lysed by a variety of methods known in the art, including
mechanical force, enzymatic digestion, and chemical treatment. The method of lysis should
be suited to the type of host cell. For example, a lysis buffer containing fresh protease
inhibitors is added to yeast cells, along with an agent that disrupts the cell wall (e.g., sand,
glass beads, zirconia beads), after which the mixture is shaken violently using a shaker (e.g.,
vortexer, paint shaker).

In a specific embodiment, zirconia beads are contacted with the yeast cells, and the
cells lysed by mechanical disruption by vortexing. In a further embodiment, lysing of the
yeast cells in a high-density array format is accomplished using a paint shaker. The paint
shaker has a platform that can firmly hold at least eighteen 96-well boxes in three layers,
thereby allowing for high-throughput processing of the cultures. Further, the paint shaker
violently agitates the cultures, even before they are completely thawed, resulting in efficient
disruption of the cells while minimizing protein degradation. In fact, as determined by
microscopic observation, greater than 90% of the yeast cells can be lysed in under two
minutes of shaking.

The resulting cellular debris can be separated from the protein and/or other
molecules of interest by centrifugation. Additionally, to increase purity of the protein
sample in a high-throughput fashion, the protein-enriched supernatant can be filtered,
preferably using a filter on a non-protein-binding solid support. To separate the soluble
fraction, which contains the proteins of interest, from the insoluble fraction, use of a filter
plate is highly preferred to reduce or avoid protein degradation. Further, these steps
preferably are repeated on the fraction containing the cellular debris to increase the yield of
protein.

Proteins can then be purified from the protein-enriched supernatant using a variety
of affinity purification methods known in the art. Affinity tags useful for affinity
purification of fusion proteins by contacting the fusion protein preparation with the binding
partner to the affinity tag, include, but are not limited to, calmodulin,
trypsin/anhydrotrypsin, glutathione, immunoglobulin domains, maltose, nickel, or biotin
and its derivatives, which bind to calmodulin-binding protein, bovine pancreatic trypsin
inhibitor, glutathione-S-transferase ("GST tag"), antigen or Protein A, maltose binding
protein, poly-histidine ("His tag"), and avidin/streptavidin, respectively. Other affinity tags
can be, for example, myc or FLAG. Fusion proteins can be affinity purified using an
appropriate binding compound (i.e., binding partner such as a glutathione bead), and
isolated by, for example, capturing the complex containing bound proteins on a
non-protein-binding filter. Placing one affinity tag on one end of the protein (e.g., the
carboxy-terminal end), and a second affinity tag on the other end of the protein (e.g., the
amino-terminal end) can aid in purifying full-length proteins.
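
The "respectively" pairing in the paragraph above can be written out as an explicit lookup, shown here as a minimal Python sketch. The dictionary simply restates the pairs in the order listed; the helper function is hypothetical and illustrates the preference, stated in Section 5.1, for attaching proteins via an affinity tag different from the one used to purify them.

```python
# Binding compound -> fusion-protein tag it binds, in the order listed above.
BINDING_PARTNERS = {
    "calmodulin": "calmodulin-binding protein",
    "trypsin/anhydrotrypsin": "bovine pancreatic trypsin inhibitor",
    "glutathione": "glutathione-S-transferase (GST tag)",
    "immunoglobulin domains": "antigen or Protein A",
    "maltose": "maltose binding protein",
    "nickel": "poly-histidine (His tag)",
    "biotin and its derivatives": "avidin/streptavidin",
}


def attachment_candidates(purification_compound):
    """List binding compounds other than the one used for purification.

    Illustrative helper only: it encodes the stated preference for a second,
    different affinity pair when attaching proteins to the solid support.
    """
    if purification_compound not in BINDING_PARTNERS:
        raise KeyError(purification_compound)
    return [c for c in BINDING_PARTNERS if c != purification_compound]


# Proteins purified on glutathione could, for example, be attached via nickel.
candidates = attachment_candidates("glutathione")
```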

In a particular embodiment, the fusion proteins have GST tags and are affinity
purified by contacting the proteins with glutathione beads. In a further embodiment, the
glutathione beads, with fusion proteins attached, can be washed in a 96-well box without
using a filter plate to ease handling of the samples and prevent cross contamination of the
samples.

In addition, fusion proteins can be eluted from the binding compound (e.g.,
glutathione bead) with elution buffer to provide a desired protein concentration. In a
specific embodiment, fusion proteins are eluted from the glutathione beads with 30 µl of
elution buffer to provide a desired protein concentration.

For purified proteins that will eventually be spotted onto microscope slides, the
glutathione beads are separated from the purified proteins. Preferably, all of the glutathione
beads are removed to avoid blocking of the microarray pins used to spot the purified
proteins onto a solid support. In a preferred embodiment, the glutathione beads are
separated from the purified proteins using a filter plate, preferably comprising a
non-protein-binding solid support. Filtration of the eluate containing the purified proteins
should result in greater than 90% recovery of the proteins.

The elution buffer preferably comprises a liquid of high viscosity such as, for 
example, 15% to 50% glycerol, preferably about 40% glycerol. The glycerol solution 
stabilizes the proteins in solution, and prevents dehydration of the protein solution during 
the printing step using a microarrayer. 

Purified proteins are preferably stored in a medium that stabilizes the proteins and
prevents desiccation of the sample. For example, purified proteins can be stored in a liquid
of high viscosity such as, for example, 15% to 50% glycerol, preferably in about 40%
glycerol. It is preferred to aliquot samples containing the purified proteins, so as to avoid
loss of protein activity caused by freeze/thaw cycles.

The skilled artisan can appreciate that the purification protocol can be adjusted to
control the level of protein purity desired. In some instances, isolation of molecules that
associate with the protein of interest is desired. For example, dimers, trimers, or higher 
order homotypic or heterotypic complexes comprising an overproduced protein of interest 
can be isolated using the purification methods provided herein, or modifications thereof. 

Furthermore, associated molecules can be individually isolated and identified using methods
known in the art (e.g., mass spectroscopy). 

5.3 Methods for Making a Proteome Array. 

The present invention is also directed to methods of making proteome chips.
Accordingly, the invention provides methods for constructing a positionally addressable
array comprising the step of attaching a plurality of proteins to a surface of a solid support,
with each protein being at a different position on the solid support, wherein the plurality of
proteins comprises at least one representative protein for at least 1%, 2%, 3%, 4%, 5%,
10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% of the known genes in a
species, wherein the protein is all protein isoforms and splice variants derived from a gene.
In another embodiment, the invention provides methods for constructing a 
positionally addressable array comprising the step of attaching a plurality of proteins to a 
surface of a solid support, with each protein being at a different position on the solid 
support, wherein the plurality of proteins comprises at least 1%, 2%, 3%, 4%, 5%, 10%,
20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% of all proteins expressed in a
species. 
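
The coverage thresholds recited above can be checked with simple arithmetic. The following Python sketch is illustrative only: the function names, the tier tuple, and the gene identifiers are hypothetical and are not part of the disclosed methods.

```python
# Illustrative sketch; all names and identifiers are hypothetical.
COVERAGE_TIERS = (0.01, 0.02, 0.03, 0.04, 0.05, 0.10, 0.20, 0.30, 0.40,
                  0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99)


def proteome_coverage(arrayed_genes, known_genes):
    """Fraction of known genes with at least one representative protein arrayed."""
    known = set(known_genes)
    if not known:
        raise ValueError("known_genes must not be empty")
    return len(known & set(arrayed_genes)) / len(known)


def highest_tier_met(coverage):
    """Largest recited threshold that the coverage meets, or None if below 1%."""
    met = [tier for tier in COVERAGE_TIERS if coverage >= tier]
    return max(met) if met else None


known = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
arrayed = ["GENE_A", "GENE_B", "GENE_C"]
coverage = proteome_coverage(arrayed, known)
print(coverage, highest_tier_met(coverage))
```

Counting genes rather than individual proteins mirrors the "at least one representative protein" wording; counting expressed proteins instead would only change the inputs.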

In another embodiment, the present invention provides a method for constructing a
positionally addressable array comprising the step of attaching a plurality of proteins to a
surface of a solid support, with each protein being at a different position on the solid
support, wherein the plurality of proteins comprises at least 1, 2, 3, 4, 5, 10, 20, 30, 40, 50,
100, 200, 500, 1000, 1500, 2000, 2500, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10,000, 
100,000, 500,000 or 1,000,000 protein(s) expressed in a species. 

The present invention also relates to methods for making a positionally addressable
array comprising the step of attaching a plurality of proteins to a surface of a solid support,
with each protein being at a different position on the solid support, wherein the solid
support is cast from a microfabricated mold, and wherein the plurality of proteins comprises
at least 1%, 2%, 3%, 4%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or
99% of all proteins expressed in a species, or comprises at least 1, 2, 3, 4, 5, 10, 20, 30, 40,
50, 100, 200, 500, 1000, 1500, 2000, 2500, 3000, 4000, 5000, 6000, 7000, 8000, 9000,
10,000, 100,000, 500,000 or 1,000,000 protein(s) expressed in a species, or comprises at
least one representative protein for at least 1%, 2%, 3%, 4%, 5%, 10%, 20%, 30%, 40%,
50%, 60%, 70%, 80%, 90%, 95%, or 99% of the known genes in a species, wherein the
protein is all protein isoforms and splice variants derived from a gene. The present
invention contemplates a variety of solid supports cast from a microfabricated mold, some
of which are disclosed, for example, in co-pending United States Application No. 
09/849,781, filed on May 4, 2001, which is incorporated herein by reference in its entirety. 

The present invention also relates to methods for making a positionally addressable
array comprising the step of attaching a plurality of proteins to a surface of a solid support,
with each protein being at a different position on the solid support, wherein the protein
comprises a first tag and a second tag. The advantages of using double-tagged proteins
include the ability to obtain highly purified proteins, as well as providing a streamlined
manner of purifying proteins from cellular debris and attaching the proteins to a solid
support. In a particular embodiment, the first tag is a glutathione-S-transferase tag
("GST tag") and the second tag is a poly-histidine tag ("His tag"). In a further embodiment,
the GST tag and the His tag are attached to the amino-terminal end of the protein. 
Alternatively, the GST tag and the His tag are attached to the carboxy-terminal end of the 
protein. 

In yet another embodiment, the GST tag is attached to the amino-terminal end of the 
protein. In a further embodiment, the His tag is attached to the carboxy-terminal end of the
protein. In yet another embodiment, the His tag is attached to the amino-terminal end of the
protein. In a further embodiment, the GST tag is attached to the carboxy-terminal end of the 
protein. 

In yet another embodiment, the protein comprises a GST tag and a His tag, and
neither the GST tag nor the His tag is located at the amino-terminal or carboxy-terminal end
of the protein. In a specific embodiment, the GST tag and His tag are located within the 
coding region of the protein of interest; preferably in a region of the protein not affecting the 
binding domain of interest. 

In one embodiment, the first tag is used to purify a fusion protein. In another
embodiment, the second tag is used to attach a fusion protein to a solid support. In a
specific further embodiment, the first tag is a GST tag and the second tag is a His tag. 

The protein preferably is a fusion protein such that the heterologous sequence 
comprises the coding region for the protein of interest and sequences encoding a tag, such as 
an affinity tag. Such tags can be useful for monitoring the protein, separating the fusion
protein from cellular debris and contaminating reagents, and/or attaching the protein to a
proteome chip of the invention. 

Examples of inducers include, but are not limited to, galactose, enhancer-binding 
proteins, and other transcription factors. In one embodiment, galactose is contacted with a 
regulatory sequence comprising a galactose-inducible GAL1 promoter. 

A binding agent that can be used in accordance with the invention includes, but is
not limited to, a glutathione bead, a nickel-coated solid support, and an antibody. In one
embodiment, the complex comprises a fusion protein having a GST tag bound to a
glutathione bead. In another embodiment, the complex comprises a fusion protein
having a His tag bound to a nickel-coated solid support. In yet another embodiment, the
complex comprises the protein of interest bound to an antibody and, optionally, a secondary
antibody. 

5.4 Methods for Using a Proteome Array. 

The invention is also directed to methods for using proteome chips to assay the 
presence, amount, and/or functionality of proteins present in at least one sample. Using the
proteome chips of the invention, chemical reactions and assays in a large-scale parallel 
analysis can be performed to characterize biological states or biological responses, and 
determine the presence, amount, and/or biological activity of proteins. Accordingly, the 
proteome chips of the invention can be used to assay for essentially all protein-protein 
interactions in a cell, tissue, organ, system, or organism.

Additionally, the proteome chips of the invention can be used to assay for biological
responses to a particular stimulus given to a host cell (i.e., a cell used to produce the fusion
protein). For example, yeast cells transformed with expression vectors encoding fusion
proteins can be subjected to a stimulus, after which the fusion proteins are purified and
arrayed. The arrayed proteins can then be characterized by, for example, probing with any
binder. The resulting binding pattern is then compared to an identical array produced from 
yeast cells not subjected to the stimulus, or subjected to a different stimulus. Differences in 
the binding patterns can be characteristic of the biological response, and can identify 
specific interactors of interest with respect to the biological response. 
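
The comparison of binding patterns between a stimulated and an unstimulated array can be sketched as a per-position fold-change calculation. The following Python example is a minimal sketch under stated assumptions: the function name, the two-fold cutoff, the background floor, and the signal values are hypothetical and are not taken from the disclosure.

```python
# Illustrative sketch; cutoff, floor, and data are hypothetical.
def differential_spots(stimulated, unstimulated, min_fold=2.0, floor=1.0):
    """Return {position: fold_change} for positions whose signal changed at least min_fold.

    Both inputs map an array position, here (row, column), to a measured signal.
    Signals are raised to `floor` first so that division by zero cannot occur.
    """
    changed = {}
    for position in stimulated.keys() & unstimulated.keys():
        treated = max(stimulated[position], floor)
        control = max(unstimulated[position], floor)
        fold = treated / control
        if fold >= min_fold or fold <= 1.0 / min_fold:
            changed[position] = fold
    return changed


stimulated = {(0, 0): 400.0, (0, 1): 100.0, (1, 0): 50.0}
unstimulated = {(0, 0): 100.0, (0, 1): 110.0, (1, 0): 200.0}
for position, fold in sorted(differential_spots(stimulated, unstimulated).items()):
    print(position, fold)
```

Because the two arrays are identical in layout, a shared position identifies the same protein on both, so each reported position points directly to a candidate interactor.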

In one embodiment, proteome chips prepared from host cells, each chip representing
host cells exposed to a different stimulus, are screened with a labeled lectin (e.g.,
concanavalin A). The pattern of protein-probe interactions, indicating the presence of 
glycosylated proteins, is compared to determine the effect of each stimulus on the 
glycosylation state of the proteins of the proteome chip. 

Biological activity that can be determined using a proteome chip of the invention
includes, but is not limited to, enzymatic activity (e.g., kinase activity, protease activity,
phosphatase activity, glycosidase activity, acetylase activity, and other chemical group transferring
enzymatic activity), nucleic acid binding, hormone binding, etc. High density and small
volume chemical reactions can be advantageous for the methods relating to using the
proteome chips of the invention.

Further information regarding biological states or responses can be obtained using 
the proteome chips of the invention, wherein proteins on the chip are organized according to 
a classification of proteins. The classification can be by abundance, function, functional 
class, enzymatic activity, homology, protein family, association with a particular metabolic
or signal transduction pathway, association with a related metabolic or signal transduction
pathway, or posttranslational modification. 

Upon contacting the proteins of a proteome chip of the invention with one or more 
probes, protein-probe interactions can be assayed using a variety of techniques known in the 
art. For example, the proteome chip can be assayed using standard enzymatic assays that
produce chemiluminescence or fluorescence. Various protein modifications can be detected
by, for example, photoluminescence, chemiluminescence, or fluorescence using non-protein 
substrates, enzymatic color development, mass spectroscopic signature markers, or 
amplification of oligonucleotide tags. 

The probe is labeled or tagged with a marker so that its binding can be detected,
directly or indirectly, by methods commonly known in the art. Any art-known marker may
be used, including but not limited to tags such as epitope tags, haptens, and affinity tags,
antibodies, labels, etc., provided that it is not the same as the affinity tag or reagent used to
attach the protein(s) of the proteome chip to the solid substrate of the chip. For example, if
biotin is used as a linker to attach proteins to a proteome chip array, then another tag not
present in the protein(s) of the proteome chip, e.g., His or GST, is used to label the probe
and to detect a protein-probe interaction. In certain embodiments, a photoluminescent, 
chemiluminescent, fluorescent, or enzymatic tag is used. In other embodiments, a mass 
spectroscopic signature marker is used. In yet other embodiments, an amplifiable 
oligonucleotide, peptide or molecular mass label is used. 

In a specific embodiment, the invention provides a method for detecting a
protein-probe interaction comprising the steps of contacting a sample of labeled probe (e.g., labeled
protein) with a positionally addressable array comprising a plurality of proteins, with each 
protein being at a different position on a solid support; and detecting any positions on the 
array wherein interaction between the labeled probe and a protein on the array occurs. 

Accordingly, protein-probe interactions can be detected by, for example, 1) using
radioactively labeled ligand followed by autoradiography and/or phosphoimager analysis;
2) binding of hapten, which is then detected by a fluorescently labeled or enzymatically
labeled antibody or high-affinity hapten ligand such as biotin or streptavidin; 3) mass
spectrometry; 4) atomic force microscopy; 5) fluorescent polarization methods; 6) infrared
red labeled compounds or proteins; 7) amplifiable oligonucleotides, peptides or molecular
mass labels; 8) stimulation or inhibition of the protein's enzymatic activity; 9) rolling
circle amplification-detection methods (Hatch et al., 1999, "Rolling circle amplification of
DNA immobilized on solid surfaces and its application to multiplex mutation detection",
Genet. Anal. 15:35-40); 10) competitive PCR (Fini et al., 1999, "Development of a
chemiluminescence competitive PCR for the detection and quantification of parvovirus B19
DNA using a microplate luminometer", Clin Chem. 45:1391-6; Kruse et al., 1999,
"Detection and quantitative measurement of transforming growth factor-beta1 (TGF-beta1)
gene expression using a semi-nested competitive PCR assay", Cytokine 11:179-85;
Guenthner and Hart, 1998, "Quantitative, competitive PCR assay for HIV-1 using a
microplate-based detection system", Biotechniques 24:810-6); 11) colorimetric procedures;
and 12) biological assays (e.g., for virus titers).

In a particular embodiment, protein-probe interactions are detected by direct mass 
spectrometry. In a further embodiment, the identity of the protein and/or probe is 
determined using mass spectrometry. For example, one or more probes that have bound to a
protein on the proteome chip can be dissociated from the array, and identified by mass
spectrometry (see, e.g., WO 98/59361). In another example, enzymatic cleavage of a
protein on the proteome chip can be detected, and the cleaved protein fragments or other 
released compounds can be identified by mass spectrometry. 

In one embodiment, each protein on the proteome chip is contacted with a probe,
and the protein-probe interactions are detected and quantified. In another embodiment, each
protein on the proteome chip is contacted with multiple probes, and the protein-probe
interaction is detected and quantified. For example, the proteome chip can be
simultaneously screened with multiple probes including, but not limited to, complex
mixtures (e.g., cell extracts), intact cellular components (e.g., organelles), whole cells, and
probes pooled from several sources. The protein-probe interactions are then detected and
quantified. Useful information can be obtained from assays using mixtures of probes due, 
in part, to the positionally addressable nature of the arrays of the present invention, i.e., via 
the placement of proteins at known positions on the protein chip, the protein to which the 
probe binds ("interactor") can be characterized. 
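
The role of positional addressing can be sketched as a lookup from array position to protein identity. The following Python example is illustrative only: the layout, protein names, signal values, and threshold are hypothetical, and the simple fixed threshold is an assumption rather than the scoring method of the invention.

```python
# Illustrative sketch; layout, names, and threshold are hypothetical.
def identify_interactors(layout, signals, threshold):
    """Return the sorted names of proteins whose array position shows signal >= threshold.

    layout  maps (row, column) -> protein name (the positionally addressable part).
    signals maps (row, column) -> signal measured after probing.
    """
    return sorted(
        layout[position]
        for position, signal in signals.items()
        if position in layout and signal >= threshold
    )


layout = {(0, 0): "Protein_1", (0, 1): "Protein_2", (1, 0): "Protein_3"}
signals = {(0, 0): 1200.0, (0, 1): 80.0, (1, 0): 950.0}
print(identify_interactors(layout, signals, threshold=500.0))
```

Because identity comes from the layout rather than from the probe, the same lookup applies whether the probe is a single labeled protein or a complex mixture.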

One of ordinary skill in the art can appreciate many different embodiments for
assaying various cellular interactions by using probes to screen the proteome chips of the
invention. For example, multiple sequential screens of a proteome chip with various probes
can define all proteins involved in a particular signal transduction pathway or in a specific
metabolic pathway. Moreover, these assays can be useful for diagnostic, prognostic and/or
therapeutic purposes.

In accordance with the methods of the invention, a probe can be a cell, cell 
membrane, subcellular organelles, protein-containing cellular material, protein, 
oligonucleotide, polynucleotide, DNA, RNA, small molecule (i.e., a compound with a 
molecular weight of less than 500), substrate, drug or drug candidate, receptor, antigen,
steroid, phospholipid, antibody, immunoglobulin domain, glutathione, maltose, nickel,
dihydrotrypsin, lectin, or biotin. 

Probes can be biotinylated for use in contacting a protein array so as to detect 
protein-probe interactions. Weakly biotinylated proteins are more likely to maintain the 
biological activity of interest. Thus, a gentler biotinylation procedure is preferred so as to
preserve the protein's binding activity or other biological activity of interest. Accordingly,
in a particular embodiment, probe proteins are biotinylated to differing degrees using a 
biotin-transferring compound (e.g., Sulfo-NHS-LC-LC-Biotin; PIERCE™ Cat. No. 21338, 
USA). 

In addition, the probe can be an enzyme substrate or inhibitor. For example, the
probe can be a substrate or inhibitor of an enzyme such as, but not limited to, kinases,
phosphatases, proteases, glycosidases, acetylases, and other group transferring enzymes.
After incubation of proteins on a chip with combinations of nucleic acid or protein probes,
the bound nucleic acid or protein probes can be identified, for example, by mass
spectrometry (Lakey et al., 1998, "Measuring protein-protein interactions", Curr Opin Struct
Biol. 8:119-23).

Accordingly, various cellular responses to interaction with the proteins on a 
proteome chip can be assayed by probing with whole cells. For example, a proteome chip 
can be contacted with lymphocytes and assayed for lymphocyte activation by various means 
including, but not limited to, detecting antibody synthesis, detecting or measuring
incorporation of 3H-thymidine, labeling cell surface molecules with antibodies to identify
molecules induced or suppressed by antigen recognition and activation (e.g., CD23, CD38,
IgD, C3b receptor, IL-2 receptor, transferrin receptor, membrane class II MHC molecules,
PCA-1 molecules, HLA-DR), and identifying expressed and/or secreted cytokines.

In another example, mitogens for a specific cell-type can be determined by
incubating a cell with a proteome chip. Mitotic activity can be determined, for example, by
detecting or measuring incorporation of 3H-thymidine by a cell. Cells can be of the same
cell type (i.e., a homogeneous population) or can be of different cell types.

In another example, differentiation factors for a specific cell-type can be determined 
by incubating a cell with a proteome chip. Differentiation of a cell can be determined, for
example, by visual inspection, detection of cell-surface differentiation markers using
marker-specific antibodies, or identification of secreted differentiation markers. 

In another example, apoptotic factors for a specific cell-type can be determined by 
incubating a cell with a proteome chip. Apoptosis can be assayed, for example, by visual 
inspection, detection of cell-surface apoptotic markers using marker-specific antibodies, or
identification of secreted markers or other cellular components released into the media.

In another example, the secretory response of a cell to a protein on a proteome chip 
can be assayed by incubating a cell with a proteome chip of the invention. Secreted proteins 
and other cellular compounds can be assayed, for example, by detecting the released 
compounds in the media. 

In another example, the ability of a protein on a proteome chip to mediate cell
aggregation can be assayed, for example, by incubating one or more cells with a proteome
chip of the invention, and assaying for aggregation. Also, a protein's ability to mediate an
affinity to extracellular matrix can be assayed by, for example, incubating a cell and
extracellular matrix components with a proteome chip, and assaying for enhanced affinity of
the cell or the extracellular matrix component with a protein on the chip. Interactors
identified in such assays can have a role in, for example, cancer, cell migration,
synaptogenesis, dendritic growth, process extension, or axonal elongation. 

In yet another example, the effect of proteins of a proteome chip of the invention on 
ion transport, or other small molecule transport (e.g., ATP), can be determined. For
example, the probe cells can be pre-loaded with a radioactively labeled ion or other small
molecule, and incubated with a proteome chip of the invention. Retention or release of the 
radioactive label can be measured at different time points after contacting the cells with the 
proteins of the proteome array. Alternatively, ion transport can be detected and 
characterized using electrophysiological techniques known in the art. 

In yet another example, cellular uptake and/or processing of proteins on the
proteome chips can be assayed by, for example, incubating a cell with a proteome chip
having radioactively or fluorescently labeled proteins on the chip, and measuring the 
increase or decrease in signal on the proteome chip, or measuring uptake of labeled protein 
by the cell. 

Alternatively, a proteome chip of the invention can be incubated with a cell and a
labeled compound of interest, such that cellular uptake and/or processing of the compound
by the cell is detected and/or measured. 

Interactions of small molecules (i.e., compounds smaller than MW=500) with the
proteins on a proteome chip also can be assayed in a cell-free system by probing with small
molecules such as, but not limited to, ATP, GTP, cAMP, phosphotyrosine, phosphoserine,
and phosphothreonine. Such assays can identify all proteins in a species that interact with a
small molecule of interest. Small molecules of interest can include, but are not limited to,
pharmaceuticals, drug candidates, fungicides, herbicides, pesticides, carcinogens, and
pollutants. Small molecules used as probes in accordance with the methods of the invention
preferably are non-protein, organic compounds.

In another embodiment, essentially all receptors for a particular ligand, or class of 
ligands, in a species can be identified by contacting a ligand of interest with a proteome
chip of the invention. Alternatively, essentially all ligands in a species that are identified by 
a particular receptor or receptor family of interest can be identified by contacting a receptor
of interest with a proteome chip of the invention. In another embodiment, essentially all
proteins in a species, capable of inhibiting or blocking formation of a particular
receptor-ligand complex, can be identified by contacting a receptor and its ligand with a
proteome chip of the invention, and determining whether receptor-ligand interaction is
inhibited as compared with the degree of receptor-ligand interaction in the absence of the
protein on the chip. Detection of receptor-ligand interaction and identification of the ligand
interactors can be accomplished using methods known in the art. 

In another embodiment, essentially all kinase targets in a species can be identified
by, for example, contacting a kinase with a proteome chip of the invention, and in the
presence of labeled phosphate, detecting phosphorylated interactors using methods known
in the art. Alternatively, essentially all kinases in a species can be identified by contacting a
substrate that can be phosphorylated with a proteome chip of the invention, and assaying the
presence and/or level of phosphorylated substrate by, for example, using an antibody
specific to a phosphorylated amino acid. In another embodiment, essentially all kinase
inhibitors in a species can be identified by contacting a kinase and its substrate with a
proteome chip of the invention, and determining whether phosphorylation of the substrate is
reduced as compared with the level of phosphorylation in the absence of the protein on the 
chip. 
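
The comparison used in the kinase-inhibitor screen, phosphorylation in the presence of an arrayed protein versus its absence, can be expressed as a percent-inhibition calculation. The following Python sketch is illustrative only: the function names, the 50% cutoff, and the signal values are hypothetical assumptions.

```python
# Illustrative sketch; cutoff and data are hypothetical.
def percent_inhibition(with_protein, without_protein):
    """Percent reduction in phosphorylation signal relative to the no-protein control."""
    if without_protein <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (1.0 - with_protein / without_protein)


def candidate_inhibitors(spot_signals, control_signal, min_inhibition=50.0):
    """Return {protein: percent inhibition} for spots meeting the cutoff."""
    candidates = {}
    for protein, signal in spot_signals.items():
        inhibition = percent_inhibition(signal, control_signal)
        if inhibition >= min_inhibition:
            candidates[protein] = inhibition
    return candidates


spots = {"Protein_1": 250.0, "Protein_2": 900.0}
print(candidate_inhibitors(spots, control_signal=1000.0))
```

A negative percent inhibition would indicate enhanced phosphorylation, which the same calculation reports without modification.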

Detection methods for kinase activity are known in the art, and include, but are not
limited to, the use of radioactive labels (e.g., 33P-ATP and 35S-γ-ATP) or fluorescent
antibody probes that bind to phosphoamino acids. 

Similarly, assays can be conducted to identify all phosphatases, and inhibitors of a 
phosphatase, in a species. For example, whereas incorporation into a protein of 
radioactively labeled phosphorus indicates kinase activity in one assay, another assay can be
used to measure the release of radioactively labeled phosphorus into the media, indicating
phosphatase activity. 

The proteome chips of the invention can also be used to distinguish different cell 
types (either morphological or functional) by, for example, contacting a proteome chip with 
cells or cell extracts representing different populations of cells, and comparing the patterns
of protein-probe interactions on the proteome chip. This approach also can be used to
characterize, for example, different stages of the cell cycle, disease states, altered
physiologic states (e.g., hypoxia), physiological state before or after treatment (e.g., drug
treatment), metabolic state, stage of differentiation, developmental stage, response to
environmental stimuli (e.g., light, heat), response to environmental toxins (e.g., pesticides,
herbicides, pollution), cell-cell interactions, cell-specific protein expression, and
disease-specific protein expression. 

Developmental profiles of protein-protein interactions can be used to characterize
signal transduction pathways, metabolic pathways, etc. involved at every developmental stage
and elucidate transitions between developmental stages. The wealth of information
provided by such studies can be used to identify drug targets for each stage, and/or tailor
treatment regimens during the course of a disease. 

The proteome chips of the invention can be incubated with cell extracts to
characterize a particular cell type, response to a stimulus, or physiological state.
Accordingly, in exemplary embodiments, a proteome chip of the invention can be contacted
with a cell extract from cells treated with a compound (e.g., a drug), or from cells at a
particular stage of cell differentiation (e.g., pluripotent), or from cells in a particular
metabolic state (e.g., mitotic), and assayed for kinase, protease, glycosidase, acetylase,
phosphatase, and/or other transferase activity, for example.

The pattern of protein-probe interactions on the proteome chip can thereby provide a
"signature" or "fingerprint" characteristic of the biological state. For example, the results
obtained from such assays, comparing for example, cells in the presence or absence of a
drug, or cells at several differentiation stages, or cells in different metabolic states, can
provide a signature of each condition, and can provide information regarding the
physiologic changes in the cells under the different conditions.

Clearly, by screening a species' proteome using a plurality of probes (e.g., known
mixtures of probes, cellular extracts, subcellular organelles, cell membrane preparations,
whole cells, etc.), the resulting analysis of protein-probe interactions can form the basis of
identifying a "fingerprint" or "signature" of a cell type or physiological state of a cell,
tissue, organ or system. Such information can be useful for diagnosis, prognosis, drug
testing, and drug discovery, for example. 
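
One simple way to compare two such "signatures" is to reduce each array readout to the set of positive proteins and measure the overlap between sets. The following Python sketch is illustrative only: the Jaccard index, the threshold, and the signal values are assumptions chosen for the example and are not prescribed by the disclosure.

```python
# Illustrative sketch; metric, threshold, and data are hypothetical.
def signature(signals, threshold):
    """Set of proteins whose signal meets the threshold in one condition."""
    return frozenset(name for name, signal in signals.items() if signal >= threshold)


def jaccard_similarity(first, second):
    """Overlap between two signatures: 1.0 means identical, 0.0 means disjoint."""
    union = first | second
    if not union:
        return 1.0  # two empty signatures are treated as identical
    return len(first & second) / len(union)


untreated = {"Protein_1": 900.0, "Protein_2": 50.0, "Protein_3": 700.0, "Protein_4": 30.0}
treated = {"Protein_1": 850.0, "Protein_2": 640.0, "Protein_3": 20.0, "Protein_4": 10.0}

sig_untreated = signature(untreated, threshold=500.0)
sig_treated = signature(treated, threshold=500.0)
print(jaccard_similarity(sig_untreated, sig_treated))
print(sorted(sig_untreated ^ sig_treated))  # proteins that differ between conditions
```

The symmetric difference lists the proteins that distinguish the two states, which is the part of the signature most relevant to diagnosis or drug testing.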

Accordingly, the proteome chips of the invention can be used to determine a drug's 
interactions with proteins on the chip. Alternatively, the proteome chips of the invention 
can be used to characterize a drug's effects on complex protein mixtures such as, for
example, whole cells, cell extracts, or tissue homogenates. For example, a proteome chip
can be contacted with a complex protein mixture and assayed for altered interactions of the
protein mixture with the proteins on the chip when compared in the presence or absence of
the drug.

The net effect of a drug can thereby be analyzed by screening one or more proteome
chips with drug-treated cells, tissues, or extracts, which then can provide a "signature" for
the drug-treated state, and when compared with the "signature" of the untreated state, can be
of predictive value with respect to, for example, potency, toxicity, and side effects.
Furthermore, time-dependent effects of a drug can be assayed by, for example, adding the
drug to the cell, cell extract, tissue homogenate, or whole organism, and applying the
drug-treated cells or extracts, prepared at various timepoints of the treatment, to a proteome
chip. Such assays can be useful for diagnosis or prognosis of a disease. 

In particular, the proteome chips of the invention can be useful for characterizing a
mode of action of a drug, determining drug specificity, predicting drug toxicity, and for drug
discovery. For example, the identity of proteins that bind to a drug, and their relative
affinities, can be assayed by incubating a proteome chip with a drug or drug candidate under
different assay conditions, determining drug specificity by determining where on the array 
the drug bound, and measuring the amount of drug bound by each different protein. 
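
Ranking proteins by the amount of drug bound can be sketched as a normalization against the strongest binder. The following Python example is illustrative only: the function name and values are hypothetical, and bound signal is used here as a rough proxy for relative affinity, which is an assumption rather than a measured binding constant.

```python
# Illustrative sketch; names and data are hypothetical.
def relative_binding(bound_amounts):
    """Rank proteins by bound drug, normalized so the strongest binder scores 1.0."""
    if not bound_amounts:
        raise ValueError("bound_amounts must not be empty")
    top = max(bound_amounts.values())
    if top <= 0:
        raise ValueError("at least one protein must show positive binding")
    ranked = sorted(bound_amounts.items(), key=lambda item: item[1], reverse=True)
    return [(protein, amount / top) for protein, amount in ranked]


bound = {"Protein_1": 200.0, "Protein_2": 800.0, "Protein_3": 400.0}
for protein, score in relative_binding(bound):
    print(protein, score)
```

A drug that scores near 1.0 for one protein and near 0.0 for all others would be considered highly specific under this simple measure.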

The proteome chips of the invention can be used to determine a disease state by, for
example, contacting a proteome chip with diseased cells, cell extracts or tissue homogenates
from diseased tissue, or body fluids from a patient suffering from a disease, and comparing
the pattern of protein-probe interactions on the proteome chip with that of a healthy
counterpart. Such assays can provide a "signature" for the disease state, and when
compared with the "signature" of the healthy state, can be of predictive value with respect
to diagnosis or prognosis of the disease. Furthermore, stages of a disease can be
characterized by, for example, assaying biological preparations on the proteome chip at
various stages of the disease. 

Bioassays in which a biological activity is assayed, rather than binding assays, can
also be carried out on the same proteome chip, or on an identical second chip. Thus,
these types of assays using the protein chips of the invention are useful for studying drug
specificity, predicting potential side effects of drugs, and classifying drugs. 

Further, proteome chips of the invention are suitable for screening complex libraries 
of drug candidates. Specifically, the proteins on the chip can be incubated with the library 
of drug candidates, and then the bound components can be identified, e.g., by mass
spectrometry, which allows for the simultaneous identification of all library components
that bind preferentially to specific subsets of proteins, or bind to several of the proteins on 
the chip. Additionally, the relative affinity of the drug candidates for the different proteins 
in the array can be determined. 

Moreover, the protein chips of the present invention can be probed in the presence
of potential inhibitors, catalysts, modulators, or enhancers of an observed interaction,
enzymatic activity, or biological response. Using a proteome chip of the present invention,
such strategic screens can identify proteins expressed in a species that, for example, block
the binding of a drug, inhibit viral infection, exhibit bacteriostatic activity, exhibit
anti-fungal activity, ameliorate parasitic infection, or physiological effectors to specific
categories of proteins.

Enzymatic reactions can be performed and enzymatic activity measured using the
proteome chips of the present invention. In a specific embodiment, compounds that
modulate the enzymatic activity of a protein or proteins on a chip can be identified. For
example, changes in the level of enzymatic activity can be detected and quantified by
incubating a compound or mixture of compounds with an enzymatic reaction mixture,
thereby producing a signal (e.g., from substrate that becomes fluorescent upon enzymatic
activity). Differences between the presence and absence of a test compound can be
characterized. Furthermore, the differences in a compound's effect on enzymatic activities
can be detected by comparing their relative effect on samples within the proteome chip and
between chips.

A variety of strategies of using the proteome chips of the present invention can be 
employed to determine various physical and functional characteristics of proteins. For 
example, the protein chips can be used to assess the presence and amount of protein present 
by probing with an antibody. The protein can be detected using standard detection assays
such as luminescence, chemiluminescence, fluorescence, chemifluorescence, or mass
spectrometry. For example, a primary antibody to the protein of interest is recognized by a
fluorescently labeled secondary antibody, which is then measured with an instrument (e.g., a
Molecular Dynamics scanner) that excites the fluorescent product with a light source and
detects the subsequent fluorescence. For greater sensitivity, a primary antibody to the
protein of interest is recognized by a secondary antibody that is conjugated to an enzyme
such as alkaline phosphatase or horseradish peroxidase. In the presence of a luminescent
substrate (for chemiluminescence) or a fluorogenic substrate (for chemifluorescence),
enzymatic cleavage yields a highly luminescent or fluorescent product which can be
detected and quantified by using, for example, a Molecular Dynamics scanner.
Alternatively, the signal of a fluorescently labeled secondary antibody can be amplified
using an alkaline phosphatase-conjugated or horseradish peroxidase-conjugated tertiary
antibody.

In one embodiment, a proteome chip of the invention can be probed with antibodies
directed against known proteins in one species, such that homologous proteins having
recognized epitopes can be identified in the proteome of another species. Same species
homologs of the interactors can be obtained by, for example, using the DNA sequence
information of the homologous protein to identify the homolog in the other species. In
specific embodiments, the antibody is directed against cyclin, kinase, GST, Clb5, Cla4,
Ste20, Cdc42, PI(3,4)P2, PI(4)P, SPA2, CLB1, CLB2, or Cdc11.

In another embodiment, a proteome chip of the invention containing proteins of a
first species can be probed with a protein from a second species to identify interactors.
Homologs in the second species of the interactors from the first species can then be
identified and characterized by, for example, nucleotide sequence homology. Thus, where
the proteome is not available for a particular species, this strategy can be used to find same
species proteins that interact with a protein of interest.

The proteome chips of the invention can be used to identify essentially all substrates 
in a species for each enzyme found in a species. Accordingly, identifying substrates of 
protein kinases, phosphatases, proteases, glycosidases, acetylases, or other group
transferring enzymes can be conducted on the protein chips of the present invention.

In one embodiment, protease activity can be detected by identifying, using standard 
assays (e.g., mass spectrometry, fluorescently labeled antibodies to peptide fragments, or
loss of fluorescence signal from a fluorescently tagged substrate), peptide fragments that are
produced by protease activity and released into the media. Thus, activity of
group-transferring enzymes can be assayed readily by several approaches using any means
of detection, which would be appreciated by one of ordinary skill in the art.

The proteome chips of the invention can be used to identify essentially all binding 
proteins in a species for any compound. Accordingly, proteome chips can be used to 
identify and characterize essentially all proteins that bind, for example, kinases, proteases,
hormones, DNA, RNA, phosphatases, glycosidases, acetylases, or other group
transferring enzymes. Thus, the chip can be probed with a probe, and assayed for
protein-probe interaction and/or assayed for the desired activity. For example, if RNA 
binding is the activity of interest, the proteome chip is probed with RNA, and protein-RNA 
complexes are identified. 

For example, the proteome chips of the invention can be used to identify essentially
all proteins in a species that bind to membrane-associated proteins or other
membrane-associated compounds by contacting the chip with probes such as, for example,
whole cells, preparations of plasma membranes, membrane vesicles, or liposomes
comprising membrane components of interest, and detecting protein-probe interaction. In a
particular embodiment, the probe is in the form of a liposome comprising one or more
phospholipids of interest. The protein-probe interaction can be detected using techniques
known in the art. The identity of the interactor and/or probe can be determined using
techniques known in the art. Moreover, biological activities (e.g., enzyme activity, cell
activation) can also be detected using techniques known in the art.

The identity of target proteins from pathogens (e.g., an infectious disease agent such
as a virus, bacterium, fungus, or parasite) or target proteins from abnormal cells (e.g., 
neoplastic cells, diseased cells, or damaged cells) that serve as antigens in the immune 
response of recovering or non-recovering patients can be determined by using a proteome
chip of the invention. For example, lymphocytes isolated from a patient can be used to
screen chips comprising all of a pathogen's proteins. In general, these screens comprise
contacting a proteome chip with a plurality of lymphocytes, wherein the proteins on the
proteome chip comprise a plurality of potential antigens, and detecting positions on the chip
where lymphocyte activation occurs. In a specific embodiment, lymphocytes are contacted
with a pathogen's proteins on the proteome chip, after which activation of B-cells or T-cells
by an antigen or a mixture of antigens is assayed, thereby identifying target antigens derived
from a pathogen.

Alternatively, the proteome chips of the invention can be used to characterize an
immune response by, for example, screening a proteome of an infectious organism with a
patient's lymphocytes to identify the targets of a patient's B-cells and/or T-cells. For
example, B-cells can be incubated with a proteome chip of the invention to identify
antigenic targets for humoral-based immunity.

In another embodiment, the proteome chips of the invention can be used to detect
and characterize substrates of autoimmunity or allergy-causing proteins. For example, a
proteome of human proteins can be screened, with a patient's lymphocytes or with a
patient's circulating antibodies, to identify the targets of a patient's B-cells and/or T-cells.
Such screens can characterize autoimmunity or allergic reactions, and identify potential
target drug candidates.

In one embodiment, the proteome chips of the invention are used to identify
substances that are able to activate B-cells or T-cells. For example, lymphocytes are
contacted with the proteome chip, and lymphocyte activation is assayed, thereby identifying
substances that have a general ability to activate B-cells or T-cells or subpopulations of
lymphocytes (e.g., cytotoxic T-cells).

Induction of B-cell activation by antigen recognition can be assayed by various
means including, but not limited to, detecting or measuring antibody synthesis,
incorporation of 3H-thymidine, binding of labeled antibodies to newly expressed or
suppressed cell surface molecules, and secretion of factors indicative of B-cell activation
(e.g., cytokines). Similarly, T-cell activation in a screen using a protein chip of the
invention can be determined by various assays. For example, a chromium (51Cr) release
assay can detect recognition of antigen and subsequent activation of cytotoxic T-cells (see,
e.g., Palladino et al., 1987, Cancer Res. 47:5074-9; Blachere et al., 1993, J. Immunotherapy
14:352-6).

The specificity of an antibody preparation can be determined through the use of a 
proteome chip of the invention, comprising contacting the chip with an antibody
preparation, and detecting positions on the solid support where binding by an antibody in
the antibody preparation occurs. The antibody preparation can be, but is not limited to, Fab
fragments, antiserum, and polyclonal, monoclonal, chimeric, single chain, humanized, or
synthetic antibodies. In one example, a proteome chip is probed with a monoclonal
antibody to characterize its binding strength and/or its specificity. In specific embodiments,
the antibody is directed against cyclin, kinase, GST, Clb5, Cla4, Ste20, Cdc42, PI(3,4)P2,
PI(4)P, SPA2, CLB1, CLB2, or Cdc11.

The proteome chips of the invention are useful for identifying proteins that bind to 
specific molecules of biologic interest including, but not limited to, receptors for potential 
ligand molecules, virus receptors, and ligands for orphan receptors. 

The proteome chips are also useful for detecting DNA binding or RNA binding to
proteins on the chips, and for evaluating the binding specificity and strength. The DNA can
be single-stranded or double-stranded. The RNA can be mRNA, hnRNA, polyA+ RNA, or
total RNA.

The proteome chips of the invention are useful for identifying proteins that are
modified posttranslationally. Posttranslational modifications that can be detected using the
methods of the invention include, but are not limited to, methylation, acetylation,
farnesylation, biotinylation, stearoylation, formylation, lipoic acid, myristoylation,
palmitoylation, geranylgeranylation, pegylation, phosphorylation, sulphation, glycosylation,
sugar modification, lipidation, lipid modification, ubiquitination, sumoylation, disulphide
bonding, cysteinylation, oxidation, glutathionylation, pyroglutamic acid, carboxylation, and
deamidation. In addition, a protein can be modified posttranslationally with pentoses,
hexosamines, N-acetylhexosamines, deoxyhexoses, hexoses, and sialic acid.

In one embodiment, a proteome chip of the invention can be probed with 
phosphotyrosine, phosphoserine, or phosphothreonine to identify phosphorylated
interactors. In another example, a proteome chip of the invention can be probed with a
lectin (e.g., wheat germ agglutinin) to identify glycosylated interactors (e.g., 
N-acetylglucosamine). The phosphorylated or glycosylated interactors can be detected 
using methods known in the art. 

The proteome chips of the invention are also useful for identifying and
characterizing protein isoforms that exhibit differences in function, ligand binding, or
enzymatic activity. In a particular embodiment, a proteome chip of the invention is used to
characterize different binding affinities by protein isoforms derived from different alleles by 
assaying their activities relative to one another. 

In one embodiment, consensus sequences can be determined using the proteome
chips of the invention. For example, upon screening a proteome chip with a binder (which
can be from a species different from that of the proteins on the chip), the amino acid
sequences of the interactors can be aligned to construct a consensus sequence for the
interacting domain(s). Such information can be useful, for example, for designing
inhibitors of the binder-interactor interaction, or for designing novel and/or improved
binders for a particular interactor or class of interactors.

In a specific embodiment, a consensus sequence for calmodulin-binding proteins is 
determined by screening a proteome chip with calmodulin. In a further embodiment, the 
consensus sequence for a calmodulin-binding protein consists of the amino acid sequence 
I/L-Q-X-K-K/X-G-B (SEQ ID NO: 1). In another embodiment, the consensus sequence for
a calmodulin-binding protein comprises the amino acid sequence I/L-Q-X-K-K/X-G-B
(SEQ ID NO: 1). In a further embodiment, the consensus sequence for a 
calmodulin-binding protein comprises the amino acid sequence I/L-Q-X-K-K/X-G-B (SEQ 
ID NO: 1), and is less than 10, 15, 30, 50, or 100 amino acid residues in length. 
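
As an illustration of how such a consensus sequence might be applied, the following minimal sketch scans a protein sequence for the motif I/L-Q-X-K-K/X-G-B. It assumes that X denotes any residue and that B denotes a basic residue (K, R, or H); these symbol meanings are not defined in the text above and are assumptions of this sketch, as is the function name `find_motif`.

```python
import re

# Assumptions (not stated in the text above):
#   X = any residue, B = a basic residue (K, R, or H).
# The position written "K/X" therefore accepts any residue.
# A lookahead is used so that overlapping matches are also reported.
MOTIF = re.compile(r"(?=[IL]Q.K.G[KRH])")


def find_motif(sequence):
    """Return the 0-based start positions of every motif occurrence."""
    return [match.start() for match in MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequences for illustration only.
    print(find_motif("MAIQAKKGRTT"))
    print(find_motif("MAAAAAAAAAA"))
```

A scan of this kind only flags candidate sites; whether a flagged protein actually binds calmodulin would still have to be established experimentally.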

The proteome chips of the invention are also useful for identifying a set of potential
antibacterial, antifungal, antiparasitic or antiviral compounds. For example, cell lysates or
other preparations of bacteria, fungi, parasites or viruses can be contacted with a proteome 
array, and protein-probe interactions detected and identified. Additionally, comparing 
interaction patterns obtained from infectious organisms at infectious stages and 
non-infectious stages can identify interactions involved in infection of the host. 

Screening of phage display libraries can be performed by incubating a library with
the proteome chips of the present invention. The detection of clones binding to a protein on
the chip can be conducted by various methods known in the art (e.g., mass spectrometry),
thereby identifying clones of interest, after which the DNA encoding the clones of interest
can be identified by standard methods (see, e.g., Ames et al., 1995, J. Immunol. Methods
184:177-86; Kettleborough et al., 1994, Eur. J. Immunol. 24:952-8; Persic et al., 1997,
Gene 187:9-18). In this manner, the chips are useful to select for cells having surface
components that bind to specific proteins on the chip.

5.5 Kits.

The invention also provides kits for carrying out the assay regimens of the invention. 
In one embodiment, kits of the invention comprise one or more proteome chips of the 
invention. Such kits may further comprise, in one or more containers, reagents useful for
assaying biological activity of a protein or molecule, reagents useful for assaying
protein-probe interaction, and/or one or more probes, proteins or other molecules. The
reagents useful for assaying biological activity of a protein or other molecule, or assaying
interactions between a probe and a protein or other molecule, can be applied with the probe,
attached to a proteome chip, or contained in one or more wells on a proteome chip. Such
reagents can be in solution or in solid form. The reagents may include either or both the
proteins or other molecules and the probes required to perform the assay of interest.

The proteins of the proteome chip can be attached to the surface of a flat solid 
support, contained in wells on a solid support, or attached to the surface of wells on the 
solid support. In one embodiment, the proteome chip in the kit has the proteins and/or
probes already attached to the solid support. In another embodiment, the proteome chip in
the kit can have the reagent(s) or reaction mixture useful for assaying biological activity of a
protein or other molecule, or useful for assaying the interaction of a probe and a protein or
other molecule, already attached to wells on the solid support. In yet another embodiment,
the reagent(s) is not attached to the wells of the solid support, but is contained in the wells.
In yet another embodiment, the reagent(s) is not attached to the wells of the solid support,
but is contained in one or more containers, and can be added to the wells of the solid
support. In yet another embodiment, the kit further comprises one or more containers
holding a solution reaction mixture for assaying biological activity of a protein or molecule.
In yet another embodiment, the kit provides a substrate (e.g., beads) to which probes,
proteins or molecules of interest, and/or other reagents useful for carrying out one or more
assays, can be attached, after which the substrate with attached probes, proteins, or other
reagents can be placed into the wells of the chip.

5.6 Design of a Positive Identification Algorithm. 

The present invention also provides a method for identifying whether a signal is
positive. Signals can be in any measurable form including, but not limited to, visible light,
ultraviolet radiation, infrared radiation, X-rays, fluorescence, and colorimetric visualization.
In a preferred embodiment, signals are detected by mass spectrometry.

In addition, signals can be in any arrangement, and in any physical format such as,
but not limited to, arrays, blots, gels, and screens. Preferably, signals are spots in a static
arrangement.

In another preferred embodiment, signals are produced by fluorescence and are
arranged in a grid pattern on an array. As such, a signal can be assigned a positional
coordinate with respect to row and column. The rows and columns can be of any width.

The first step in filtering signals is to calculate the local foreground and background 
signals for each spot. The local foreground signal is emitted from the actual spot, whereas 
the local background signal is emitted from the area immediately bordering the spot. The
net signal, which is the local foreground signal minus the local background signal, is used in
all further calculations. The local foreground and background signals can be identified by
software such as GENEPIX™.

However, variations between chips, which can represent, for example, different
lipid-binding experiments, and local variations on the chip (due to unequal diffusion of
substrates, for example) can result in further fluctuation of the net signal intensity, resulting
in different net signal distributions for different chips. To correct the variation between
chips, the net signals from different chips need to be scaled into a common range. One of
the chips is chosen as a reference, and the goal is to scale the net signal distribution
of each chip to the range and shape of the net signal distribution of the reference chip.

For example, the lower quartile, median, and upper quartile values of the net signal
distribution of each chip can be computed. Then, for each chip, the median net signal is
subtracted from the net signal of each spot. Furthermore, a scaling factor is computed for
each chip, which is equal to the ratio of the difference between the upper and lower quartile
of the specific chip and the difference between the upper and lower quartile of the reference
chip. This implies that the scaling factor for the reference chip is equal to one. Then the net
signals on each chip are multiplied by the chip-specific scaling factor to calculate the scaled
net signals.
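
The median subtraction and quartile-based scaling described above can be sketched as follows. This is a minimal illustration, not a definitive implementation. The function names are hypothetical, the quartiles are computed with the "inclusive" method of Python's `statistics.quantiles` (the text does not specify a quartile convention), and the scaling factor is oriented as reference interquartile range over chip interquartile range. That orientation is an assumption of this sketch, chosen because it is the one for which multiplication brings each chip to the spread of the reference chip; the factor for the reference chip itself is one either way.

```python
import statistics


def quartiles(values):
    """Return (lower quartile, median, upper quartile) of a list of net signals."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q2, q3


def scale_to_reference(chip_net, reference_net):
    """Median-center one chip's net signals and scale them to the reference spread.

    Assumption: factor = IQR(reference) / IQR(chip), so that the scaled
    chip has the same interquartile range as the reference chip.
    """
    q1, median, q3 = quartiles(chip_net)
    r1, _, r3 = quartiles(reference_net)
    factor = (r3 - r1) / (q3 - q1)
    return [(value - median) * factor for value in chip_net]


if __name__ == "__main__":
    # Hypothetical net signals for a reference chip and a second chip.
    reference = [0, 1, 2, 3, 4]
    chip = [0, 2, 4, 6, 8]
    print(scale_to_reference(chip, reference))
    print(scale_to_reference(reference, reference))
```

After scaling, both chips in this toy example share the same median-centered distribution, which is the stated goal of the between-chip correction.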

To correct for local variations on an array, a "neighborhood subtraction" for each
spot can be performed. For example, the neighborhood region can be defined as a region of
two rows above and below, as well as two columns to the left and right of a signal spot.
The median signal of this region is then subtracted from the spot signal to calculate an
excess signal relative to the neighborhood of the spot. Preferably, the number of spots of
high signal strength in any neighborhood region is sufficiently low, such that the median
value is not significantly affected and is a good representation of the background signal in
the neighborhood region.

Applying the neighborhood subtraction to the scaled net signals yields the scaled
excess signals. In the next step, parallel samples are compared with respect to their scaled
excess signals. If the difference between the average of the scaled excess signal of the two
parallel samples and the scaled excess signal of one of the parallel samples is greater than
three times the standard deviation of the error of the scaled excess signal, the spots
belonging to the two parallel samples are excluded from further analysis. The remaining
spots and their scaled excess signal then represent the set of filtered signals.
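
The neighborhood subtraction that produces the scaled excess signals can be sketched as follows. This is a minimal illustration under stated assumptions: the neighborhood is taken to be the full block spanning two rows above and below and two columns to either side, the spot itself is included in the median, and the block is simply clipped at the edges of the array. None of these details is specified in the text, and the function name is hypothetical.

```python
import statistics


def neighborhood_subtract(grid, radius=2):
    """Subtract the local median from every spot of a row-by-column grid.

    Assumptions: the neighborhood is a (2*radius + 1) square block centered
    on the spot, includes the spot itself, and is clipped at array edges.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    excess = []
    for r in range(n_rows):
        excess_row = []
        for c in range(n_cols):
            region = [
                grid[i][j]
                for i in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for j in range(max(0, c - radius), min(n_cols, c + radius + 1))
            ]
            excess_row.append(grid[r][c] - statistics.median(region))
        excess.append(excess_row)
    return excess


if __name__ == "__main__":
    # Hypothetical 5 x 5 array: uniform background with one strong spot.
    grid = [[1.0] * 5 for _ in range(5)]
    grid[2][2] = 10.0
    for row in neighborhood_subtract(grid):
        print(row)
```

Because a single strong spot does not move the median of its neighborhood, the background spots in this example keep an excess signal of zero, which matches the rationale given above for using the median.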

The distribution of the error of the scaled excess signals and its standard deviation
can be computed as follows. A linear regression is performed on the complete set of
parallel samples to determine the general linear relationship between parallel samples. Then
an error value can be calculated for each parallel sample:

$$\varepsilon_G = |G_2 - G_m|,$$

where $G_1$ and $G_2$ represent the two scaled excess signals of the two parallel samples,

$$G_m = (G_1 + G_2)/2,$$

and function $f(G_1) = a \cdot G_2 + b$ is the general linear relationship between the two parallel
samples $G_1$ and $G_2$, with the parameters a and b determined by linear regression.

The complete set of error values is the set of error values for all parallel samples and
represents the distribution of error values for the scaled excess signal. Then the standard
deviation of this distribution can be calculated.
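
The error calculation and the exclusion of inconsistent parallel samples can be sketched as follows. This is a minimal illustration: it uses the error value exactly as written above (the absolute difference between the second sample and the pair average), omits the regression step because the error formula as printed does not use the regression parameters, and uses the population standard deviation, which is an assumption since the text does not say which estimator is meant. Function names are hypothetical.

```python
import statistics


def pair_errors(pairs):
    """Error value for each pair of parallel samples: |G2 - (G1 + G2) / 2|."""
    return [abs(g2 - (g1 + g2) / 2) for g1, g2 in pairs]


def filter_pairs(pairs, n_sd=3.0):
    """Drop pairs whose deviation from their own average exceeds n_sd standard deviations.

    Returns the retained pairs and the standard deviation of the error
    distribution (population standard deviation; an assumption).
    """
    errors = pair_errors(pairs)
    error_sd = statistics.pstdev(errors)
    kept = [pair for pair, err in zip(pairs, errors) if err <= n_sd * error_sd]
    return kept, error_sd


if __name__ == "__main__":
    # Hypothetical data: 19 consistent duplicate pairs and one discordant pair.
    pairs = [(5.0, 5.0)] * 19 + [(0.0, 40.0)]
    kept, error_sd = filter_pairs(pairs)
    print(len(kept), round(error_sd, 3))
```

Note that the deviation of either sample from the pair average is the same number, so testing one sample of the pair is sufficient.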

Finally, a pair of parallel samples is called positive if the average of their filtered
signals is greater than three times the standard deviation of the error of the scaled excess
values. (Note that this threshold is independent of the threshold to determine whether
parallel samples should be excluded from the set of filtered signals.)
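
The positive call itself reduces to a single comparison, sketched below. The function name and the default multiplier argument are illustrative assumptions; the standard deviation passed in is the one computed from the error distribution of the scaled excess signals.

```python
def is_positive(g1, g2, error_sd, n_sd=3.0):
    """Call a pair of parallel samples positive if their average filtered
    signal exceeds n_sd times the standard deviation of the error."""
    return (g1 + g2) / 2 > n_sd * error_sd


if __name__ == "__main__":
    # Hypothetical filtered signals and error standard deviation.
    print(is_positive(10.0, 14.0, error_sd=3.0))
    print(is_positive(4.0, 6.0, error_sd=2.0))
```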

After this filtering procedure, the filtered signal (G) is normalized with the GST
signal (R), yielding the ratio r = G/R which can be a measure of the binding per amount of
protein and can allow for comparison of binding signals between different proteins. The
specific binding ratio r is sensitive to the errors $\varepsilon_G$ and $\varepsilon_R$ in the G and R signals,
respectively. Using a Monte Carlo procedure, 90% and 95% confidence intervals for this
ratio can be calculated.

The error $\varepsilon_r$ of the r value is related to the errors $\varepsilon_G$ and $\varepsilon_R$:

$$r + \varepsilon_r = \frac{G + \varepsilon_G}{R + \varepsilon_R}$$

where $\varepsilon_r$ represents the error of the ratio r.

For the Monte Carlo procedure, the distributions of both $\varepsilon_G$ and $\varepsilon_R$ must be known.
The distribution of $\varepsilon_G$ can be computed as explained above. The distribution of $\varepsilon_R$ can be
computed in the same procedure as for $\varepsilon_G$ by using the GST signal pairs of parallel samples
as input.

The Monte Carlo procedure is as follows:
In order to determine confidence intervals for $r + \varepsilon_r$, a population of random samples of
$r + \varepsilon_r$ needs to be computed. These can be derived from random samples of $\varepsilon_G$ and $\varepsilon_R$.
Random samples of $\varepsilon_G$ can be computed as follows. Samples of uniformly distributed
random numbers between 0 and 1 are calculated with a standard random number generator.
From the distribution of $\varepsilon_G$, the inverse cumulative distribution function of $\varepsilon_G$ is determined
by standard procedures. Using the samples of random numbers as arguments for this
function produces a set of random samples for $\varepsilon_G$. Likewise, a set of random samples for $\varepsilon_R$
can be calculated. These random samples of $\varepsilon_G$ and $\varepsilon_R$ can be combined in the formula
$r + \varepsilon_r = (G + \varepsilon_G)/(R + \varepsilon_R)$ to produce a population of random samples for $r + \varepsilon_r$.
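
The Monte Carlo procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the inverse cumulative distribution function is approximated empirically by indexing into the sorted observed error values, the error samples are used exactly as supplied (any sign convention is left to the caller), draws with a zero denominator are skipped, and the confidence limits are read off as percentiles of the simulated ratios. The function names, the number of draws, and the fixed seed are choices of this sketch, not of the text.

```python
import random


def empirical_inverse_cdf(observed):
    """Build an approximate inverse CDF from observed error values."""
    ordered = sorted(observed)
    n = len(ordered)

    def inverse(u):
        # Map a uniform number u in [0, 1) to one of the sorted observations.
        return ordered[min(int(u * n), n - 1)]

    return inverse


def ratio_confidence_interval(G, R, errors_G, errors_R,
                              level=0.95, draws=10000, seed=0):
    """Percentile confidence interval for (G + e_G) / (R + e_R)."""
    rng = random.Random(seed)
    inv_G = empirical_inverse_cdf(errors_G)
    inv_R = empirical_inverse_cdf(errors_R)
    ratios = []
    for _ in range(draws):
        e_G = inv_G(rng.random())  # uniform number -> random error sample
        e_R = inv_R(rng.random())
        if R + e_R != 0:           # skip undefined ratios
            ratios.append((G + e_G) / (R + e_R))
    ratios.sort()
    lower = ratios[int((1 - level) / 2 * len(ratios))]
    upper = ratios[min(int((1 + level) / 2 * len(ratios)), len(ratios) - 1)]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical signals and error samples for illustration only.
    print(ratio_confidence_interval(6.0, 3.0, [-1.0, 0.0, 1.0], [-0.5, 0.0, 0.5]))
    print(ratio_confidence_interval(6.0, 3.0, [-1.0, 0.0, 1.0], [-0.5, 0.0, 0.5], level=0.90))
```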

In one embodiment, the invention provides a method for identifying whether a signal
is positive, comprising the steps of: determining foreground and background signals for each
spot locally and determining net signals from their difference; determining the lower
quartile, median, and upper quartile values of a first and second net signal distribution;
subtracting the first median value from the first net signal distribution, and subtracting the
second median value from the second net signal distribution, to obtain a first and second
subtracted value, respectively; dividing said first subtracted value by the difference between
said upper and lower quartile values of said first net signal distribution, and dividing said
second subtracted value by the difference between said upper and lower quartile values of
said second net signal distribution, to obtain a first and second scaled value, respectively;
computing a local median value of a scaled signal distribution of a neighborhood region,
wherein said neighborhood region comprises a plurality of sites in the area; subtracting the
local median value from the scaled signal to obtain a scaled excess value; and excluding
parallel samples of scaled excess values if the difference between one of the sample values
and their average is greater than three standard deviations of the error of the scaled excess
value.

The filtered values can be used to identify whether a signal is positive. Parallel
samples are called "positive" if the average of the filtered values of the parallel sample is
greater than three times the standard deviation of the error of scaled excess values. Filtered
positive signals can then be normalized using the formula:

$$r = G/R$$

wherein G is the filtered value, R is a GST signal, and r represents a signal per amount of
protein that can be compared among different spots. The ratio r is sensitive to the errors in
G and R. This sensitivity can be assessed by calculating confidence intervals of

$$r + \varepsilon_r = \frac{G + \varepsilon_G}{R + \varepsilon_R}$$

where $\varepsilon_G$ is the error of G, $\varepsilon_R$ is the error of R, and $\varepsilon_r$ is the error of r.

In a specific embodiment, a positive signal indicates protein-probe interaction.

In another specific embodiment, the neighborhood region is two rows above, two rows
below, two columns to the left, and two columns to the right of the signal.

In another specific embodiment, data points from two parallel samples are excluded from
further analysis, wherein the difference between the scaled excess signals of said samples
and their average is greater than three standard deviations of the error of the scaled excess
signal. Data points are also excluded if they are obviously artifactual.

6. EXAMPLES 

A defined collection of over 5800 proteins from the budding yeast was prepared
using high-throughput techniques and screened for many activities including
protein-protein, protein-DNA, protein-RNA, and protein-liposome interactions. A large
number of novel activities were identified, providing new information concerning known 
and previously uncharacterized genes. 

To facilitate studies of the yeast proteome, 5800 open reading frames were cloned 
and overexpressed, and their corresponding proteins purified. The proteins were printed
onto slides at high spatial density to form a yeast proteome microarray and screened for
their ability to interact with proteins and phospholipids. Many novel calmodulin and 
phospholipid-interacting proteins were identified; a common potential binding motif was 
identified for many of the calmodulin-binding proteins. These studies demonstrate that 
microarrays of an entire eukaryotic proteome can be prepared and screened for large 
numbers of biochemical activities resulting in the identification of many novel protein 
functions/interactions. These microarrays can also be screened to detect protein 
posttranslational modifications. 

6.1 Yeast Culture Preparation 

1. Yeast glycerol stocks stored in 96-well plates at -80°C were inoculated onto a URA-
   agar plate (Omni, USA) using a 96-pronger.

2. The culture was allowed to grow on agar at 30°C for 48 hours, until visible colonies
   (2 mm diameter) could be observed.

3. A 96-pronger was used to inoculate yeast cells from agar plates to a 96-well 2 ml
   box in which every well contained 300 µl of URA-/raffinose liquid media and a 2 mm
   diameter glass ball, which facilitates uniform growth.

4. After the culture reached an OD600 of 4.0 in about 16 hours at 30°C with vigorous
   shaking (300 rpm), 15 µl of the same strain was inoculated into 750 µl of
   URA-/raffinose liquid media in four different boxes to obtain 3 ml of culture.
   Again, each well contained the same glass ball to achieve aeration and even growth.
   The cells were grown at 30°C with vigorous shaking.

5. After 12-16 hours of growth, the culture should reach an OD600 of 0.6 to 0.8. Cultures
   were discarded if the OD600 was over 1.0. Using an automated liquid-handling device
   (Q-Fill, Genetix, UK), 40% galactose stock was added to each well to a final
   concentration of 2% to induce the cells. The cultures were induced at 30°C for 4
   hours with shaking.

6. The cells were harvested by spinning at 3000 rpm for 2-10 min, and the cell pellets
   were resuspended in 100-1000 µl of cold water by vortexing. Cells of the same
   strain were then merged from 4 wells into one. Cells were collected by spinning and
   resuspended in 100-1500 µl of cold lysis buffer without the protease inhibitors on
   ice. The washed cells were collected by a brief centrifugation, and the lysis buffer
   was discarded. The washed semi-dry culture was immediately stored in a -80°C
   freezer. The culture can be kept for weeks.
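
As a worked example of the induction step above, the volume of 40% galactose stock needed to bring a well to a final concentration of 2% can be estimated as sketched below. The calculation assumes that volumes are additive and that the well holds the 750 µl of culture described in step 4; the function name is hypothetical, and the protocol itself does not state the volume added.

```python
def stock_volume_to_add(culture_volume_ul, stock_percent, final_percent):
    """Volume of stock (in µl) to add so the mixture reaches final_percent.

    Derived from stock * v = final * (culture + v), assuming additive volumes.
    """
    return culture_volume_ul * final_percent / (stock_percent - final_percent)


if __name__ == "__main__":
    volume = stock_volume_to_add(750, stock_percent=40, final_percent=2)
    print(f"Add about {volume:.1f} µl of 40% galactose per well")
```

Under these assumptions, roughly 39.5 µl of stock per 750 µl well gives the stated 2% final concentration.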

6.2 Protein Purification in a 96-well Fashion 

1 . The frozen culture in a 96-well box was transferred from -80 °C to ice and 
100-300 pi of zirconia beads (0.5 mm diameter from BSP, Germany) was added to 
each well. While the culture was still frozen, 100-500 jal of lysis buffer containing 
fresh protease inhibitors was added. A cap mat was used to seal each well. After 
thawing the culture for 5-25 min on ice, the cells in the 96-well box were vortexed 
20-60 seconds for 3-6 times with 1-5 min intervals on ice. To efficiently disrupt the 
yeast cell wall, and to process many plates at once, a paint shaker (HARBIL™ 5G 
HD, 36 kg capacity, adjustable pressure and shaking time, fixed speed at 200 times 
per minute) was used to violently agitate the samples. 

2. After spinning at 3000 rpm for 2-10 min, the supernatant was collected using 
wide-open tips (Fisher, USA) and transferred into a 96-well filter plate (Whatman, 
USA; Whatman UNIFILTER™ , Cat. No. 7700-1 801 having a hydrophilic PVDF 
filter with an 800 fil/well capacity), which was placed on top of a 96-well box. 
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3. To obtain more proteins, 100-500 jul of lysis buffer containing fresh protease 
inhibitors was added to the cell debris, and Steps 1 and 2 were repeated. 

4. The combined cell lysate was spun through the filter plate into a cold and clean 
96-well box for 10-30 min at 3000 rpm. The volume of filtered lysate in each well 
was roughly 200-1000 [il. 

5. Meanwhile, the required amount of glutathione beads (roughly 10-50 jlxI of beads per 
sample) (Amersham, USA) was washed four times with cold lysis buffer without the 
protease inhibitors, and finally resuspended in 5X of its original volume with lysis 
buffer containing fresh protease inhibitors. 

6. 100 [xl of washed glutathione beads was added to each well and sealed tightly with a 
cap mat. The beads were incubated with the lysate on a roller drum at 4°C for one 
hour. To obtain the best mixing, the boxes were rotated 360 degrees on the roller 
drum. 

7. The beads were collected by spinning at 3000 rpm for 10-60 seconds, and the 
supernatant was discarded. Beads were washed once with 200-800 [il of Wash 
Buffer I containing protease inhibitors, and twice without the inhibitors. 

8. The beads were then washed three times with 200-800 jliI of Wash Buffer II. After 
complete removal of the buffer, 20-50 jlxI of Elution Buffer was added to each well. 
Filter plates used for the elution step comprised materials having low affinity for 
proteins (MILLIPORE MULTISCREEN™ , Cat. No. MADVN6550 having a 
hydrophilic PVDF filter with a 200 |ul/well capacity). The box was vortexed briefly 
to resuspend the beads and incubated on a roll drum for one hour at 4°C. 

9. The eluate/beads slurry was transferred to a cold filter plate (Millipore, USA), and 
the eluate was collected to a 96-well PGR plate by spinning through the filter plate 
for 0.5-2 min at 3000 rpm at 4°C. 

10. Each purified protein was aliquoted into three 96-well PCR plates and immediately
stored in a -80 °C freezer. 



Lysis Buffer:

30-300 mM Tris pH 7.5
50-300 mM NaCl
0.1-10 mM EGTA
0.01-1.0% Triton X-100
0.01-1% beta-mercaptoethanol ("BME")
0.1-3 mM phenylmethylsulfonyl fluoride ("PMSF")
Roche Protease inhibitor tablets (containing EDTA)
BME, PMSF, and inhibitor tablets were added freshly.

Wash Buffer I:

30-300 mM Tris pH 7.5
300-600 mM NaCl
0.1-10 mM EGTA
0.01-1.0% Triton X-100
0.01-1% beta-mercaptoethanol ("BME")
0.1-3 mM PMSF
Roche Protease inhibitor tablets (containing EDTA)
BME, PMSF, and inhibitor tablets were added freshly.

Wash Buffer II:

50-200 mM HEPES pH 7.5
50-300 mM NaCl
1-15% Glycerol

Elution Buffer:

50-200 mM HEPES pH 7.5
50-200 mM NaCl
20-40% Glycerol
5-40 mM Glutathione (Reduced form)
Elution buffer should be about pH 7.5.
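
The concentration ranges listed for each buffer lend themselves to a simple check before a batch is prepared. The following Python sketch records the Elution Buffer ranges stated above and tests a candidate recipe against them. The function name, the dictionary layout, and the example recipe values are illustrative assumptions and are not part of the protocol.

```python
# Elution Buffer ranges as stated in the text: (minimum, maximum, unit).
ELUTION_BUFFER_RANGES = {
    "HEPES pH 7.5": (50.0, 200.0, "mM"),
    "NaCl": (50.0, 200.0, "mM"),
    "Glycerol": (20.0, 40.0, "%"),
    "Glutathione (reduced)": (5.0, 40.0, "mM"),
}


def out_of_range(recipe, ranges=ELUTION_BUFFER_RANGES):
    """Return the components of `recipe` that are missing or outside `ranges`."""
    problems = []
    for name, (low, high, _unit) in ranges.items():
        value = recipe.get(name)
        if value is None or not (low <= value <= high):
            problems.append(name)
    return problems


# Hypothetical recipe chosen only for illustration.
example_recipe = {
    "HEPES pH 7.5": 100.0,
    "NaCl": 100.0,
    "Glycerol": 30.0,
    "Glutathione (reduced)": 20.0,
}
print(out_of_range(example_recipe))  # an empty list means every component is in range
```

The same helper could be reused for the Lysis Buffer and the Wash Buffers by passing a different range table.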


6.3 Method of Making Proteins in a High-Throughput 96-Array Format 

A yeast proteome microarray containing nearly all yeast proteins was prepared and
screened for a number of biochemical activities. A high-quality collection of 5800 yeast
ORFs (93.5% of the total) was cloned into a yeast high-copy expression vector using
recombination cloning (Mitchell et al., 1993, Yeast 9:715). The yeast proteins were fused
to GST-HisX6 at their amino termini and expressed in yeast under the control of a
galactose-inducible GAL1 promoter (Zhu et al., 2000, Nat. Genet. 26:283; Mitchell et al.,
1993, Yeast 9:715). The yeast expression strains contain individual plasmids in which the
correct yeast ORFs have been shown to be properly fused in-frame to GST by DNA
sequencing. Briefly, yeast ORFs were amplified by PCR and co-transformed into yeast cells
along with the vector to generate expression clones. The plasmids were rescued in E. coli,
and the vector-insert junctions were sequenced (FIG. 1A). If the ORF cloned was not the
ORF of interest, or a frameshift was detected, the cloning cycle was repeated. Once a
construct was confirmed, the plasmid DNA was reintroduced into yeast and E. coli to create
permanent stocks for future analyses (Zhu et al., 2000, Nat. Genet. 26:283). By repeating
the cloning cycle, 5800 unique yeast ORFs were successfully cloned, representing 93.5% of
the total.

To generate purified proteins for biochemical analysis, a robust and high-throughput
purification method for preparing proteins in a 96-well format was developed and
optimized. Yeast extracts were prepared, and fusion proteins were purified using
glutathione-agarose beads (for details of the 96-well format protein purification protocol, a
full list of results from all the experiments, and the design of the positive identification
algorithms, visit the public web site spine.mbb.yale.edu/protein_chips/). The lysis buffer and
initial washes contained 0.1% Triton to ensure that the purified proteins were free of lipids.
Using the procedures of the invention, at least 1152 protein samples can be prepared from
cells in under 10 hours.

The quality and quantity of the purified proteins were monitored using immunoblot
analysis of 60 random samples (FIG. 1B). Greater than 80% of the strains produced
detectable amounts of fusion proteins of the expected molecular weight. A manual printing
tool was used to spot 3 nl of 1152 purified proteins in duplicate onto glass slides (19), and
the proteins were detected using polyclonal anti-GST antibodies. Greater than 85% of the
samples exhibited a visible signal above background, consistent with the immunoblot
analysis. Using the procedures of the invention, fusion proteins from 6144 (64 X 96-well
boxes) yeast strains can be purified in under two weeks.
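
The throughput figures quoted in this section follow from the 96-well format. The short Python check below reproduces that arithmetic. The variable names are illustrative, and the implied total ORF count is derived here only from the stated 5800 clones and 93.5% coverage; it is not a figure reported in the text.

```python
WELLS_PER_BOX = 96

# 1152 samples prepared in under 10 hours correspond to 12 boxes of 96 wells.
boxes_per_run = 1152 // WELLS_PER_BOX
assert boxes_per_run * WELLS_PER_BOX == 1152

# 6144 strains purified in under two weeks correspond to 64 boxes of 96 wells.
strains_total = 64 * WELLS_PER_BOX
assert strains_total == 6144

# 5800 cloned ORFs at 93.5% coverage imply a total of roughly 6200 ORFs.
implied_total_orfs = round(5800 / 0.935)

print(boxes_per_run, strains_total, implied_total_orfs)
```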


6.4 Method of Making a Proteome Microarray 

To prepare the proteome chips, 6566 protein preparations representing 5800
different yeast proteins were printed in duplicate onto glass slides using a commercially
available microarrayer. As a control, known amounts of GST were also printed. Two types
of slides were employed. Aldehyde-treated microscope slides were used in initial
experiments (MacBeath et al., 2000, Science 289:1760), in which case fusion proteins were
attached to the slide surface through primary amines at the N-termini or other residues of
the fusion proteins, resulting in a relatively random orientation of proteins on the surface.
Proteins were spotted onto nickel-coated slides in subsequent experiments. In this case,
fusion proteins are attached through their HisX6 tags such that the cloned yeast portions of
the fusion proteins are essentially uniformly oriented away from the surface. Although both
types of slides were successfully used, the nickel-coated slides gave significantly superior
signals for our particular protein preparations (FIG. 2A).

To determine how much fusion protein was covalently attached to different glass
surfaces, and to assess the reproducibility of the protein attachment, chips were probed with
anti-GST antibodies. Over 93.5% of the protein samples gave signals significantly above
background (i.e., greater than approximately 10 fg of protein). A comparison with known
amounts of GST also printed on the slide indicated that about 90% of the spots contain
approximately 10 fg to 950 fg of protein. FIG. 2A shows that detection of proteins on a
proteome chip with fluorescently labeled antibodies is extremely sensitive, i.e., the
signal-to-noise ratio is high even though only 1/10,000 of the purified protein from a 3-ml
culture is spotted on the slide. The results demonstrate that it is feasible to spot 13,000
protein samples in one half the area of a standard microscope slide (2.5 cm by 7.5 cm) with
excellent resolution (FIGS. 2A and 2B). To test the reproducibility of the protein spotting,
the signals from each pair of duplicated spots were compared with one another. As
demonstrated by the sharp spike in FIG. 2C, 95% of the signals were within 5% of the
average (10).
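
The duplicate-spot comparison can be expressed as the fraction of spot pairs whose signals lie within a tolerance of the pair average. A minimal Python sketch follows, assuming that "within 5% of the average" means each signal deviates from the mean of its pair by at most 5% of that mean. The function name and the intensity values are hypothetical.

```python
def fraction_reproducible(pairs, tolerance=0.05):
    """Fraction of duplicate pairs whose signals lie within `tolerance` of the pair mean."""
    if not pairs:
        raise ValueError("at least one duplicate pair is required")
    within = 0
    for a, b in pairs:
        mean = (a + b) / 2.0
        # Both spots deviate from the pair mean by the same amount, |a - b| / 2.
        if mean > 0 and abs(a - b) / 2.0 <= tolerance * mean:
            within += 1
    return within / len(pairs)


# Hypothetical fluorescence intensities for four duplicated spots.
example_pairs = [(1000.0, 1040.0), (500.0, 505.0), (800.0, 1000.0), (250.0, 240.0)]
print(fraction_reproducible(example_pairs))
```

A histogram of the per-pair deviations would give the kind of sharp spike described for FIG. 2C when most pairs agree closely.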

6.5 Method of Using a Proteome Microarray 

Proteome chips were tested by probing for several exemplary types of biological
activities: protein-protein interactions, protein-nucleic acid interactions, and protein-lipid
interactions.

Generally, proteome chips were prepared for assays as follows. The proteome chips
were blocked by slowly immersing the printed glass slides into either BSA (1-3% (w/w)
BSA in PBS buffer; SIGMA™, USA) or glycine blocking buffer (30-300 mM glycine;
50-300 mM Tris, pH 6.5-8.5; 50-300 mM NaCl; SIGMA™, USA) with the protein side up.
The buffer was filtered through a 2 µm filter unit to remove particles. Glycine is preferable
when probing for carbohydrate-binding proteins. The slides were incubated in the blocking
buffer at 4°C overnight without any shaking (disturbance of the blocking buffer may result
in protein streaks on the glass surface).

Probe proteins were generally prepared as follows. Yeast proteins were purified from
50 ml cultures by affinity purification on glutathione beads using standard protocols, but
without elution from the beads. The protein-bound beads were washed three to five times
with cold PBS buffer (pH 8.0) (SIGMA™, USA). Approximately 1 ml of Sulfo-NHS-LC-LC-Biotin
(PIERCE™ Cat. No. 21338, USA) dissolved in PBS (pH 8.0) at a concentration of 0.1-50
mg/ml was added to the glutathione beads and incubated at 4°C for 2 h. The beads were
washed 5 times with cold PBS buffer (pH 8.0), and the proteins were eluted with 100-500 ml
of the elution buffer (50-200 mM HEPES pH 7.5; 50-200 mM NaCl; 20-40% glycerol; 5-40 mM
glutathione). Protocols resulting in more weakly biotinylated proteins are preferred.
Batches of proteins that were biotinylated to different degrees were pooled for future use.

6.5.1 Identification of Calmodulin-Interacting Proteins 

To test for protein-protein interactions, the yeast proteome was probed with
calmodulin (11). Calmodulin is a highly conserved calcium-binding protein involved in
many calcium-regulated cellular processes and has many known physical partners (Hook et
al., 2001, Annu. Rev. Pharmacol. Toxicol. 41:471). The calmodulin probe was biotinylated,
and bound probe was detected using Cy3-labeled streptavidin. As a control, the yeast
proteome was probed with Cy3-labeled streptavidin alone.

Generally, protein-protein interactions can be assayed as follows. Blocked proteome
chips are washed three to five times in PBS buffer, and the extra liquid on the glass surface
is removed by tapping the slides vertically on a KIMWIPE™. 200 µl of biotinylated protein
probe is added to the proteome chip and immediately covered by a hydrophobic plastic
cover slip (GRACE BIO-LABS™, USA). After trapped air bubbles are removed, the chip
is incubated in a humidity chamber at room temperature (RT) for one hour. To remove the
cover slip, the chip is immersed in a large volume of PBS buffer (>50 ml), and the slip
should float away. The chip is then moved to a second PBS bath (>50 ml) and washed 3 X
5 min with shaking at RT. After removing extra liquid on the surface of the chip, at least
150 µl of Cy3- or Cy5-conjugated streptavidin (PIERCE™, USA; 1:2000 to 1:4000
dilution) is added to the surface and covered by a hydrophobic plastic cover slip (GRACE
BIO-LABS™, USA). The chip is incubated for greater than 30 min in the dark at RT. The
chip is then washed as described above. To completely remove the liquid on the chip, the
chip is spun to dryness at 1500-2000 rpm for 5-10 min at RT.

If a proteome chip is to be screened with antibodies, the protein-antibody interaction
can be detected as follows. Blocked proteome chips are washed three to five times in PBS
buffer, and the extra liquid on the glass surface is removed by tapping the slides vertically
on a KIMWIPE™. 200 µl of primary antibodies (properly diluted in PBS containing 1-3%
BSA and 0.1% Triton X-100) is added to the proteome chip and immediately covered by a
hydrophobic plastic cover slip (GRACE BIO-LABS™, USA). After trapped air bubbles are
removed, the chip is incubated in a humidity chamber at RT for one hour. To remove the
cover slip, the chip is immersed in a large volume of PBS buffer (>50 ml), and the slip
should float away. The chip is then moved to a second PBS bath (>50 ml) and washed 3 X
5 min with shaking at RT. After removing extra liquid on the surface of the chip, >150 µl
of Cy3- or Cy5-conjugated secondary antibodies (properly diluted in PBS containing 1-3%
BSA and 0.1% Triton X-100) is added to the surface and covered by a hydrophobic plastic
cover slip (GRACE BIO-LABS™, USA). The chip is incubated for greater than 30 min in
the dark at RT. The chip is then washed as described above. To completely remove the
liquid on the chip, the chip is spun to dryness at 1500-2000 rpm for 5-10 min at RT.

When biotinylated calmodulin was used to probe the proteome chips in the presence
of calcium, six known calmodulin targets, namely Cmk1p, Cmk2p, Cmp2p, Dst1p, Myo4p,
and Arc35p, were identified (FIG. 3A). Cmk1p and Cmk2p are type I and II
calcium/calmodulin-dependent serine/threonine protein kinases, which are both involved in
the signal transduction pathway in the mating response (Hook et al., 2001, Annu. Rev.
Pharmacol. Toxicol. 41:471). Cmp2p (Cna2p) is one of the two yeast calcineurins, and has been
demonstrated to interact with calmodulin in vivo (Cyert et al., 1991, Proc. Natl. Acad. Sci.
U.S.A. 88:7376). Dst1p plays a role in transcription elongation (Stirling et al., 1994, EMBO
J. 13:4329). Myo4p is a class V myosin heavy chain required for proper localization of
ASH1 transcript (Bohl et al., 2000, EMBO J. 19:5514; Bertrand et al., 1998, Mol. Cell
2:437). Arc35p is a component of the Arp2/3 actin-organizing complex, which is involved
in actin assembly and function, and in endocytosis (Winter et al., 1999, Proc. Natl. Acad.
Sci. U.S.A. 96:7288). Arc35p was recently shown to interact with calmodulin in a
two-hybrid study (Schaerer-Brodbeck et al., 2000, Mol. Biol. Cell 11:1113), thus
confirming the data herein demonstrating that Arc35p and calmodulin interact in vitro. Of
the six known calmodulin targets that were not detected, two were not represented in the
collection and the remaining four were not detectable in the GST probing experiments. In
addition to known interactors, the calmodulin probe identified 33 additional interactors.
These interactors include a wide variety of different types of proteins (see Table 1; public
web site bioinfo.mbb.yale.edu/proteinchip), consistent with a role for calmodulin in many
diverse cellular processes.

In addition to the calmodulin-binding targets, one protein, Pyc1p, which bound
Cy3-labeled streptavidin, was also identified. Pyc1p is a pyruvate carboxylase 1
homolog that contains a highly conserved biotin-attachment region (Menendez et al., 1998,
Yeast 14:647). Thus, as predicted by its sequence, Pyc1p is biotinylated in vivo, and it was
identified in all experiments using streptavidin detection methods. This result shows that
proteome microarrays can be used to identify posttranslational modifications of proteins.

6.5.2 Identification of a Calmodulin-binding Motif

To identify putative calmodulin-binding domains, amino acid sequences shared
among the different calmodulin-binding targets (i.e., interactors) were determined (Zhu et
al., 2000, Nat. Genet. 26:283). Fourteen of the 39 calmodulin-binding proteins contain a
motif whose consensus is I/L-Q-X-X-K-K/X-G-B (SEQ ID NO: 1), where X is any residue
and B is a basic residue (FIG. 3B). A related sequence in myosins,
I-Q-X-X-X-X-K-X-X-X-R (SEQ ID NO: 16), has been shown previously to bind
calmodulin (Homma et al., 2000, J. Biol. Chem. 275). The results demonstrate that the
motif, I-Q-X-X-X-X-K-X-X-X-R (SEQ ID NO: 16), is found in many calmodulin-binding
proteins. Other calmodulin-binding interactors that lack this motif can possess other
calmodulin-binding sequences.
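
The two motifs above translate directly into regular expressions that can be scanned across a protein sequence. The Python sketch below is illustrative only: it assumes that "K/X" accepts any residue, that the basic residue B is lysine, arginine, or histidine, and it uses a made-up peptide rather than a real yeast sequence. The function and constant names are hypothetical.

```python
import re

# I/L-Q-X-X-K-K/X-G-B; "K/X" accepts any residue, B is assumed to be K, R, or H.
CONSENSUS_MOTIF = re.compile(r"[IL]Q..K.G[KRH]")
# I-Q-X-X-X-X-K-X-X-X-R, the related myosin sequence.
MYOSIN_MOTIF = re.compile(r"IQ.{4}K.{3}R")


def find_motifs(sequence, pattern=CONSENSUS_MOTIF):
    """Return (start, matched_text) for every occurrence, overlaps included."""
    # A lookahead with a capture group reports overlapping matches as well.
    lookahead = re.compile("(?=(" + pattern.pattern + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence.upper())]


# Made-up peptide, not a real yeast protein sequence.
toy_sequence = "MSTLQAAKKGRDEIQAAAAKAAARLL"
print(find_motifs(toy_sequence))
print(find_motifs(toy_sequence, MYOSIN_MOTIF))
```

Positions are zero-based offsets into the sequence string.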

6.5.3 Identification of ATP/GTP-binding Proteins

1. Blocked proteome chips are washed three to five times in cold PBS buffer, and the
extra liquid on the glass surface is removed by tapping the slides vertically on a
KIMWIPE™.

2. 100 µl of ATP and GTP solution (0.5-5 mM ATP-Cy3, 0.5-5 mM GTP-Cy5,
100-400 mM NaCl, 1-30 mM MgCl2, 50-300 mM Tris, pH 7-8.5) is added to the
proteome chip and immediately covered by a hydrophobic plastic cover slip
(GRACE BIO-LABS™, OR, USA). After trapped air bubbles are removed, the chip
is incubated in a humidity chamber at 4°C for one hour.

3. To remove the cover slip, the chip is immersed in a large volume (>30 ml) of
ice-cold wash buffer (100-400 mM NaCl, 1-30 mM MgCl2, 50-300 mM Tris, pH 7-8.5),
and the cover slip should float away. The chip is then moved to a second cold wash
bath (>50 ml) and washed 3 X 5 min with shaking at 4°C.

4. After removing extra liquid on the surface of the chip, the chip is spun to dryness at
1500-2000 rpm for 5-10 min at 4°C.

6.5.4 Identification of DNA-binding Proteins 

Protein-DNA and protein-RNA interactions are important for many fundamental
biological functions such as transcription regulation, chromosome segregation and
maintenance, and RNA transport and processing (Williamson, 2000, Nat. Struct. Biol.
7(10):834-7). To explore the possibility of using a proteome chip to identify proteins that
bind to DNA or RNA molecules, total yeast genomic DNA was labeled using Cy3-CTP and
Klenow fragment, and polyA+ RNA was labeled using a biotinylated psoralen derivative (31).
Each probe was incubated with a different proteome chip (FIG. 3A).

The protein-nucleic acid interaction was assayed as follows. The nucleic acid
probes were labeled as described by Winzeler et al. (1999, Science 285:901). Blocked
proteome chips were washed three to five times in PBS buffer, and the extra liquid on the
glass surface was removed by tapping the slides vertically on a KIMWIPE™. 200 µl of
labeled nucleic acid probe (50-200 mM poly(dC/dG); 100-300 mM NaCl; 50-300 mM
HEPES, pH 7.0-8.5; 10 mM MgCl2) was added to the proteome chip and immediately
covered by a hydrophobic plastic cover slip (GRACE BIO-LABS™, USA). After trapped
air bubbles were removed, the chip was incubated in a humidity chamber in the dark at
room temperature ("RT") for one hour. To remove the cover slip, the chip was immersed
in a large volume of PBS buffer (>50 ml), so that the slip would float away. The chip was
then moved to a second PBS bath (>50 ml) and washed 3 X 5 min with shaking in the dark
at RT. To completely remove the liquid on the chip, the chip was spun to dryness at
1500-2000 rpm for 5-10 min at RT.

Visual inspection revealed that the genomic DNA probe identified 41 DNA-binding
proteins (see Table 1). These included eight previously known DNA-associated proteins
and two predicted DNA-binding proteins. Of the known DNA-binding proteins, we found
three DNA-repair proteins (i.e., Xrs2p, Cdc1p, and Rad51p (Costanzo et al., 2001, Nucleic
Acids Res. 29:75)), a SWI-SNF global transcription activator complex component (i.e.,
Snf11p (Costanzo et al., 2001, Nucleic Acids Res. 29:75)), the alpha-subunit of the NC2
(Dr1/Drap1) repressor of class II transcription, Bur6p (Sternberg et al., 1987, Cell 48:567),
the transcription factor Spt2p (a negative regulator of HO gene transcription (Sternberg et
al., 1987, Cell 48:567)), and Met30p, which contains five WD-repeats and binds certain
promoter regions (Thomas et al., 1995, Mol. Cell Biol. 15:6526). We also identified
Trm2p, which binds both DNA and RNA and is involved in DNA repair and RNA
processing/modification (Sadekova et al., 1996, Curr. Genet. 30(1):50-5).

Eight proteins identified as nucleic acid-binding proteins have unknown functions.
Of these, three are likely to possess DNA-binding activities based on their amino acid
sequences and the results of the instant assay. Ycr087c contains a zinc-finger domain, a
common DNA-binding motif. Yhr054c shares sequence homology with the transcription
factor Rsc3p (Costanzo et al., 2001, Nucleic Acids Res. 29:75). Ybr025c has
putative nucleotide-binding activity (Costanzo et al., 2001, Nucleic Acids Res. 29:75).
The instant assay shows that these eight proteins bind DNA in vitro and likely bind nucleic
acids in vivo.

6.5.5 Identification of RNA-binding Proteins

In addition to Trm2p, other interactors with RNA-binding activity were identified.
For example, an isoleucyl-tRNA synthetase (Ils1p) and a ribosomal protein (Rpl41b) have
both been shown to bind RNA molecules (Lanker et al., 1992, Cell 70:647; Suzuki et al.,
1990, Curr. Genet. 17:185). Other proteins involved in translation, such as Rpl41b, Sqt1p,
and Gcd11p, tested positive in the screen for RNA-binding proteins. Rpl41b is a
component of the large ribosomal subunit. Sqt1p interacts with Rpl10p, which is an
RNA-binding protein (Costanzo et al., 2001, Nucleic Acids Res. 29:75). Gcd11p, a gamma
subunit of translation initiation factor eIF2, has GTP-binding activity and binds to the
Lsm1-7 complex (U6-specific snRNP), as demonstrated by two-hybrid analysis
(Fromont-Racine et al., 2000, Yeast 17:95). Two other protein targets that bind the DNA
probe also bind nucleotides, i.e., Thi13p is involved in nucleotide metabolism, and Ypt31p
has GTP-binding activity (Costanzo et al., 2001, Nucleic Acids Res. 29:75). The
RNA-binding and nucleotide-binding proteins identified with the DNA probe may reflect
low affinity interactions with DNA or, alternatively, may represent normal affinity for DNA
that has been previously unrecognized.

PolyA+ RNA from exponentially growing cells was labeled with biotin and used to
probe the proteome microarray. The positive signals were detected using Cy3-streptavidin.
Nineteen target interactors were identified (see Table 1), including a U1 snRNP component
(Smd1p), a protein known to bind mRNA import protein (Sxm8p), a C2-H2 zinc-finger
protein (Azf1p), and Tos8p, which was suggested by its sequence to be involved in RNA
splicing and processing (Schwikowski et al., 2000, Nat. Biot. 18:1257). In addition to
proteins that likely directly or indirectly interact with RNA, we found two transcription
factors (Bur6p and Tos8p), one transcriptional silencer (Hst3p), six other proteins encoded
by known ORFs, and seven proteins from uncharacterized ORFs. One of the unknown
ORFs exhibited the highest signals (Yer152c). It is important to note that only one
interactor, Bur6p, bound both the DNA and RNA probes. The other interactors were
uniquely detected with either a double-stranded DNA probe or an RNA probe, suggesting
that most proteins are likely to specifically bind either DNA or RNA.
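
The comparison between the two nucleic acid screens amounts to a set intersection. The Python sketch below uses only a subset of the interactors named in the text as placeholder hit lists (the complete lists are in Table 1), so the sets are incomplete illustrations and the variable names are hypothetical.

```python
# Partial hit lists taken from names mentioned in the text; Table 1 holds the full lists.
dna_probe_hits = {"Xrs2p", "Rad51p", "Bur6p", "Spt2p", "Met30p", "Trm2p"}
rna_probe_hits = {"Smd1p", "Azf1p", "Tos8p", "Bur6p", "Hst3p"}

bound_both = dna_probe_hits & rna_probe_hits
dna_only = dna_probe_hits - rna_probe_hits
rna_only = rna_probe_hits - dna_probe_hits

print(sorted(bound_both))  # proteins detected with both probes
print(len(dna_only), len(rna_only))
```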

In summary, of the 41 target interactors, eighteen are nucleic-acid interacting
proteins: eleven are known or putative DNA-binding proteins, three are RNA-binding
proteins, and four bind nucleotides.

6.5.6 Identification of Lipid-binding Proteins

The proteome chips of the invention are valuable for identifying activities that might
not be accessible by other experimental approaches such as, for example, protein-drug
interactions and protein-lipid interactions. Indeed, genome-wide analysis of proteins that
bind to phospholipids has not been explored previously for any species. A proteome chip of
the invention was used to study proteins that interact with phosphatidylinositol ("PI"). In
addition to their roles as constituents of cellular membranes, phosphatidylinositols are
important second messengers that regulate diverse cellular processes, including growth,
differentiation, cytoskeletal rearrangements, and membrane trafficking, and are found in the
nucleus, vacuole, and plasma membrane (Odorizzi et al., 2000, TIBS 25:229; Fruman et al.,
1998, Annu. Rev. Biochem. 67:481; Martin, 2000, Annu. Rev. Cell Dev. Bio. 14:231; Wera
et al., 2001, FEMS Yeast Res. 1406:17). Because they are often present only transiently
and in low abundance within cells, phosphatidylinositols have not been characterized
extensively in yeast, and consequently, little is known about proteins that bind particular
phospholipids (Odorizzi et al., 2000, TIBS 25:229; Fruman et al., 1998, Annu. Rev.
Biochem. 67:481; Martin, 2000, Annu. Rev. Cell Dev. Bio. 14:231; Wera et al., 2001,
FEMS Yeast Res. 1406:1).

To identify PI-binding proteins in yeast, liposomes were used as probes because the
liposome provides the most relevant physiological condition to assay the interactors. Six
types of liposomes were used. Each contains phosphatidylcholine ("PC") with 1% (w/w)
N-(biotinoyl)-1,2-dihexadecanoyl-sn-glycero-3-phosphoethanolamine, triethylammonium
salt (biotin DHPE). The biotinylated lipid serves as a label that can be detected by
Cy3-streptavidin (21). In addition to PC, the five other liposomes contain either 5% (w/w)
PI(3)P, PI(4)P, PI(3,4)P2, PI(4,5)P2, or PI(3,4,5)P3 (FIG. 3A). Each of these phospholipids
has been found in yeast except PI(3,4,5)P3 (Odorizzi et al., 2000, TIBS 25:229; Fruman et
al., 1998, Annu. Rev. Biochem. 67:481; Martin, 2000, Annu. Rev. Cell Dev. Bio. 14:231;
Wera et al., 2001, FEMS Yeast Res. 1406:1).

The protein-liposome interaction was assayed as follows. Appropriate amounts of
each lipid in chloroform were mixed and dried under nitrogen. The lipid mixture was
resuspended in TBS buffer by vortexing. The liposomes were created by sonication.
Blocked proteome chips were washed three to five times in PBS buffer, and the extra liquid
on the glass surface was removed by tapping the slides vertically on a KIMWIPE™. 200 µl
of liposome solution was added to the proteome chip and immediately covered by a
hydrophobic plastic cover slip (GRACE BIO-LABS™, USA). After trapped air bubbles
were removed, the chip was incubated in a humidity chamber at RT for one hour. To
remove the cover slip, the chip was immersed in a large volume of PBS buffer (50 ml), so
that the slip would float away. The chip was then moved to a second PBS bath (>50 ml)
and washed 3 X 5 min with shaking at RT. After removing extra liquid on the surface of
the chip, about 100 µl of Cy3- or Cy5-conjugated streptavidin (PIERCE™, USA; 1:2000 to
1:4000 dilution) was added to the surface and covered by a hydrophobic plastic cover slip
(GRACE BIO-LABS™, USA). The chip was incubated for greater than 30 min in the dark
at RT. The chip was then washed as described above. To completely remove the liquid on
the chip, the chip was spun to dryness at 1500-2000 rpm for 5-10 min at RT.

The six liposomes identified a total of 150 different interactors that produced signals
significantly higher than the background. An algorithm was devised to assist in the
identification of positive signals (42). Fifty-two (35%) of the interactors (i.e., lipid-binding
proteins) correspond to uncharacterized proteins, thus ascribing the first biochemical
activities to those proteins. These results also indicate that many previously uncharacterized
proteins have potentially important biochemical activities.

Of the 52 uncharacterized proteins, thirteen (25%) are predicted to be associated
with membranes (Gerstein, 1998, Proteins 33:518). Many others contain basic stretches,
which can mediate electrostatic interactions with the negatively charged
phosphatidylinositols.
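
The percentages quoted for the lipid-binding interactors can be checked with a few lines of arithmetic. The Python snippet below uses only the counts stated in the text; the variable names are illustrative.

```python
total_interactors = 150
uncharacterized = 52
predicted_membrane = 13

# Interactors that correspond to previously described proteins.
characterized = total_interactors - uncharacterized
# Share of all interactors that are uncharacterized, rounded to a whole percent.
pct_uncharacterized = round(100 * uncharacterized / total_interactors)
# Share of the uncharacterized proteins predicted to be membrane-associated.
pct_membrane = round(100 * predicted_membrane / uncharacterized)

print(characterized, pct_uncharacterized, pct_membrane)
```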

Of the 98 previously described interactors, the subcellular localization of 81
has been investigated (Costanzo et al., 2001, Nucleic Acids Res. 29:75). Most are either
membrane-associated proteins or are involved in lipid metabolism. More specifically, 45
proteins are membrane-associated and have either known or predicted membrane-spanning
regions (Gerstein, 1998, Proteins 33:518). Among these identified interactors are integral
membrane proteins, proteins with lipid modifications (e.g., the glycosylphosphatidylinositol
(GPI) anchor proteins Tos6p and Sps2p (Costanzo et al., 2001, Nucleic Acids Res. 29:75)
and the prenylated proteins Gpa2p and mating pheromone a-factor (Ansari et al., 1999, J. Bio.
Chem. 274:30052)), and peripherally associated proteins (e.g., Kcc4p and Myo4p, which are
found at the bud neck and/or cell periphery (Bohl et al., 2000, EMBO J. 19:5514; Bertrand et al.,
1998, Mol. Cell 2:437)). Eight others are involved in lipid metabolism (e.g., Bpl1p) or
inositol ring phosphorylation (e.g., Kcs1p), or are predicted to be involved in membrane/lipid
function (e.g., Ylr020cp, which has homology to triacylglycerol lipase).

The interactors (i.e., phospholipid-binding proteins) were grouped in several ways
(42). First, the interactors were sorted according to whether they bound lipids strongly or
weakly, based on the phospholipid-binding signal relative to the amount of GST (FIG. 4).
Slightly more (72%) of the strong lipid-binding proteins (FIGS. 4A and 4B) were
characterized relative to the weakly binding proteins (54%) (FIGS. 4C and 4D). Many of
the strong lipid-binding proteins are membrane-associated, or are predicted to be associated
with membranes, whereas fewer of the weaker binding proteins appear to be associated with
membranes (FIG. 4, "Membrane" column). Surprisingly, nineteen of the lipid-binding
proteins are kinases, and seventeen of these are protein kinases. Moreover, thirteen of the
seventeen protein kinases bind very strongly to the phospholipids.

The proteins were also grouped by whether they preferentially bound one or more
phosphatidylinositols as compared with phosphatidylcholine. One hundred and one proteins
bound to phosphatidylcholine as well or nearly as well as to the phosphatidylinositol lipids
tested (PI/PC < 1.3) (FIGS. 4B and 4D). However, 49 proteins bound to one or more
phosphatidylinositols preferentially (PI/PC > 1.3) (FIGS. 4A and 4C). Analysis of the
strong interactors (i.e., phosphatidylinositol-binding proteins) revealed that many
specifically bound particular phosphatidylinositol lipids. For example, Stp22p, which is
required for vacuolar targeting of temperature-sensitive plasma membrane proteins, such as
Ste2p and Can1p, preferentially binds PI(4)P (Li et al., 1999, Mol. Cell Biol. 19:3588).
Nine protein kinases specifically bind PI(4)P and PI(3,4)P2 strongly and one binds these
lipids weakly. Atp1p, the alpha subunit of F1-ATP synthase of the inner membrane of
mitochondria, also preferentially binds PI(3,4)P2 (Arnold et al., 1999, J. Biol. Chem.).
Sps2p, which is involved in the middle/late stage of sporulation and is localized to the prospore
membrane (Chu et al., 1998, Science 282:699), preferentially interacts with PI(3)P. One
interesting example is Myo4p, which preferentially binds PI(4,5)P2; perhaps binding of this
lipid is an important part of its interaction at the cell cortex and/or its regulation. No strong
lipid-binding targets were found that specifically bound PI(3,4,5)P3, although some proteins
bound both this lipid and others (FIG. 4). These results demonstrate that many
membrane-associated proteins, including integral membrane proteins and peripherally
associated proteins, preferentially bind specific phospholipids in vivo.
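
The PI/PC grouping described above can be sketched as a ratio test against the 1.3 cutoff. The Python example below is a minimal illustration under stated assumptions: the function name, the dictionary layout, and the signal values are hypothetical, and the actual scoring algorithm (42) may normalize or filter the signals differently.

```python
PI_LIPIDS = ("PI(3)P", "PI(4)P", "PI(3,4)P2", "PI(4,5)P2", "PI(3,4,5)P3")


def classify_binding(signals, cutoff=1.3):
    """Return PI/PC ratios and the PI lipids bound preferentially (PI/PC > cutoff).

    `signals` maps each liposome name, including "PC", to a binding signal.
    """
    pc_signal = signals["PC"]
    if pc_signal <= 0:
        raise ValueError("PC signal must be positive to form a ratio")
    ratios = {lipid: signals[lipid] / pc_signal for lipid in PI_LIPIDS}
    preferred = [lipid for lipid in PI_LIPIDS if ratios[lipid] > cutoff]
    return ratios, preferred


# Hypothetical signals for a single protein, in arbitrary units.
example_signals = {
    "PC": 100.0,
    "PI(3)P": 110.0,
    "PI(4)P": 260.0,
    "PI(3,4)P2": 240.0,
    "PI(4,5)P2": 120.0,
    "PI(3,4,5)P3": 90.0,
}
ratios, preferred = classify_binding(example_signals)
print(preferred)
```

A protein with an empty `preferred` list would fall into the group that binds PC as well or nearly as well as the PI lipids.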

Unexpectedly, many proteins involved in glucose metabolism, including regulators
of glucose metabolism, bind phospholipids. Specifically, three glucose metabolism
enzymes, i.e., phosphoglycerate mutase (Gpm3p), enolase (Eno2p), and pyruvate kinase
(Cdc19p/Pyk1p), which participate in sequential steps in glycolysis, were identified.
Hexokinase (Hxk1p), which converts glucose to glucose-6-phosphate, and two protein
kinases known to be regulators of glucose metabolism (i.e., Snf1p and Rim15p) were also
identified as lipid-binding interactors. Hxk1p has recently been shown to bind zwitterionic
micelles, which stimulate its activity (30); Eno2p has been shown to be secreted, suggesting
that it might also interact with membranes (31). Accordingly, in one embodiment,
phospholipids can be used to regulate steps involved in glucose metabolism. In another
embodiment, glycolytic steps can be enhanced by conducting steps of glucose metabolism
on phospholipid scaffolds such as, but not limited to, cell membranes.

Surprisingly, the probes comprising phospholipids recognized many proteins not
expected to be involved in membrane function or lipid signaling. Therefore, six proteins,
i.e., Rim15p, Eno2p, Hxk1p, Sps1p, Ygl059wp, and Gcn2p, were further tested for
phosphoinositol binding using two types of standard assays (Williamson, 2000, Nat. Struct.
Biol. 7(10):834-7). For Rim15p, Eno2p, and Hxk1p, PI(4,5)P2 liposomes were first
adhered to a nitrocellulose membrane, which was then blocked by BSA; different amounts
of the GST fusion proteins and a GST control were used to probe the membrane, and bound
proteins were detected using anti-GST antibodies. As shown in FIG. 5A, each yeast fusion
protein tightly bound PI(4,5)P2 and exhibited a dosage effect; GST alone did not. We also
carried out the reverse assay for four GST fusion proteins, Rim15p, Sps1p, Ygl059wp, and
Gcn2p (Guerra et al., 2000, Biosci. Rep. 20:41). Different amounts of these purified
proteins were spotted onto nitrocellulose filters and probed with the six different liposomes
(FIG. 5B). The bound liposomes were detected using a horseradish peroxidase
("HRP")-conjugated streptavidin. As with experiments using the proteome chips, liposomes
bound to each protein, and did not bind the BSA control. Sps1p bound all five
phosphoinositol-containing liposomes nearly equally. Rim15p, Gcn2p, and Ygl059wp
exhibited different affinities to different liposomes (see FIG. 5B for Rim15p). All three
proteins bound most strongly to PI(3)P, PI(4)P, and PI(3,4)P2. For Rim15p, a linear
correlation between the binding signal and the level of Rim15p was revealed (FIG. 5C).

Further, several proteins with unknown functions also showed strong lipid-binding 
activities, indicating that they are likely to bind phospholipids in cells. Routine assays 
known to those skilled in the art can be used to quickly screen lipid-binding interactors that 
are identified with a proteome chip to determine lipid-binding activity in vivo. Additionally, 
in accordance with the invention, a proteome chip can be used to determine if these 
interactors are involved in phospholipid metabolic pathways or signal transduction 
pathways. 

In summary, these results unequivocally demonstrate that many lipid-binding 
proteins, including proteins not previously known to bind lipids, can be detected and 
characterized using a proteome chip of the invention. Moreover, phospholipid-binding 
interactors identified using the proteome chips of the invention can be shown to be bona 
fide lipid-binding proteins in conventional assays. 

Moreover, because such interactions cannot be studied by using either the yeast 
two-hybrid system or DNA/oligo microarray technology, the proteome chips of the 
invention and the methods of using them represent a pioneering approach to exploring 
molecule-protein interactions. 

These studies indicate that it is feasible to prepare and screen a proteome microarray 
for a eukaryote. The novel and unobvious combination of several technological features 
was critical for practicing the present invention. First, the proteins analyzed in this study 
were prepared from yeast expression clones that have been verified by DNA sequencing and 
contain pure plasmids. Second, it was essential to produce defined clones and proteins in a 
high-throughput fashion. The procedures described herein provide enough protein for 5,000 
protein chips. Third, the proteins were produced in a eukaryotic host. As such, large 
numbers of full-length proteins that are properly folded, posttranslationally modified and/or 
complexed with their native partners can be produced. As demonstrated by immunoblot 
analysis, at least 80% of the purified fusion proteins are expressed as full-length proteins. 

Many interactors can bind to probes through associated proteins, and not directly. In 
some cases, the interactor can be part of a multimer. However, such associated proteins can 
also be detected and identified using the methods of the present invention. Nevertheless, 
the interactors detected by the methods of the invention likely directly interact with, or at 
least are tightly associated with a protein interacting with, a probe. Proteins of the proteome 
chips of the invention were prepared using stringent conditions (e.g., washed with 0.5 M 
NaCl), and Coomassie staining revealed only those bands detectable by anti-GST antibodies, 
indicating that contamination with other proteins was minimal. 

The collection of proteins assembled is likely to underrepresent secreted proteins 
with properly folded extracellular domains. A GST tag and a HisX6 tag were fused 
upstream of the translational initiation codon, such that membranous proteins having a 
signal peptide may not be delivered to the secretory pathway and may not be folded or 
modified appropriately. Three proteins having signal peptides were identified in the screens 
for lipid-binding proteins, suggesting that at least some secreted proteins are produced and 
contain functional domains. 

Further, because not all proteins are readily overproduced and purified using the 
high-throughput methods practiced, not all interactions were detected. However, the protein 
arrays used are estimated to contain approximately 80% of the full-length yeast proteins at 
reasonable levels for screening, and thus most protein interactions can be expected 
to be detected. 

These results demonstrate that a proteome chip can be screened for biochemical 
activities, thereby allowing global proteome analysis. Similar procedures can be used to 
prepare protein arrays of 10-100,000 proteins for global high-throughput proteome analysis 
in humans and other eukaryotes. 

7. REFERENCES CITED 

All references cited herein are incorporated herein by reference in their entirety and 
for all purposes to the same extent as if each individual publication or patent or patent 
application was specifically and individually indicated to be incorporated by reference in its 
entirety for all purposes. 

8. EQUIVALENTS 

Many modifications and variations of this invention can be made without departing 
from its spirit and scope. A person of ordinary skill in the art will recognize, or be able to 
ascertain through routine experimentation, various alternatives, adaptations, and 
modifications to the particular embodiments of the invention described herein, all of which 
are within the scope of the invention. Accordingly, the claimed invention intends to 
encompass all such equivalents. Thus, the specific embodiments described herein are 
offered by way of example only, and the invention is to be limited only by the terms of the 
appended claims, along with the full scope of equivalents to which such claims are entitled. 

We claim: 

1. A positionally addressable array comprising a plurality of proteins, with each 
protein being at a different position on a solid support, wherein the plurality 
of proteins comprises at least one protein encoded by at least 50% of the 
known genes in a single species. 

2. The array of Claim 1, wherein the plurality of proteins comprises at least one 
protein encoded by at least 70% of the known genes in a single species. 

3. A positionally addressable array comprising a plurality of proteins, with each 
protein being at a different position on a solid support, wherein the plurality 
of proteins comprises at least 50% of all proteins expressed in a single 
species, wherein protein isoforms and splice variants are counted as a single 
protein. 

4. A positionally addressable array comprising a plurality of proteins, with each 
protein being at a different position on a solid support, wherein the plurality 
of proteins comprises at least 1000 proteins expressed in a single species. 

5. A positionally addressable array comprising a plurality of proteins, with each 
protein being at a different position on a solid support, wherein the plurality 
of proteins in aggregate comprise proteins encoded by at least 1000 different 
known genes in a single species. 

6. The array of Claim 1, 3, 4, or 5, wherein the proteins are organized on the 
array according to a classification of proteins. 

7. The array of Claim 6, wherein the classification is by abundance, function, 
enzymatic activity, homology, protein family, association with a particular 
metabolic pathway, or posttranslational modification. 

8. The array of Claim 1, 3, 4, or 5, wherein the proteins are attached to the solid 
support via a His tag. 

9. The array of Claim 1, 3, 4, or 5, wherein the solid support comprises nickel. 

10. The array of Claim 1, 3, 4, or 5, wherein the solid support comprises a 
nickel-coated slide. 

11. A method for making a positionally addressable array comprising the step of 
attaching a plurality of proteins to a surface of a solid support, with each 
protein being at a different position on the solid support, wherein the 
plurality of proteins comprises at least one protein encoded by at least 50% 
of the known genes in a single species. 

12. A method for making a positionally addressable array comprising the step of 
attaching a plurality of proteins to a surface of a solid support, with each 
protein being at a different position on the solid support, wherein the 
plurality of proteins comprises at least 50% of all proteins expressed in a 
single species, wherein protein isoforms and splice variants are counted as a 
single protein. 

13. A method for making a positionally addressable array comprising the step of 
attaching a plurality of proteins to a surface of a solid support, with each 
protein being at a different position on the solid support, wherein the 
plurality of proteins comprises at least 1000 proteins expressed in a single 
species. 

14. A method for making a positionally addressable array comprising the step of 
attaching a plurality of proteins to a surface of a solid support, with each 
protein being at a different position on the solid support, wherein the 
plurality of proteins in aggregate comprise proteins encoded by at least 1000 
different known genes in a single species. 

15. A method for making a positionally addressable array comprising the step of 
attaching a plurality of fusion proteins to a surface of a solid support, with 
each fusion protein being at a different position on the solid support, wherein 
the fusion protein comprises a first tag, a second tag, and a protein sequence 
encoded by genomic nucleic acid of an organism. 

16. The method of Claim 15, wherein prior to said attaching step is a step of 
purifying the protein by contacting the protein with the binding partner of 
said first tag, and wherein the second tag is used in said attaching step to 
attach the protein to the solid support. 

17. The method of Claim 16, wherein the first tag is a GST tag and the second 
tag is a His tag. 

18. The method of Claim 15, wherein the first tag and the second tag are found 
at the amino-terminal end of the protein. 

19. The method of Claim 15, wherein the first tag and the second tag are found 
at the carboxy-terminal end of the protein. 

20. A method for making and isolating a plurality of purified protein samples, 
comprising the steps of: 

at each site of a plurality of sites of a multi-site array: 

(a) growing a eukaryotic cell having a heterologous nucleotide sequence 
operatively linked to a regulatory sequence; 

(b) contacting the regulatory sequence with an inducer that enhances 
expression of a protein encoded by the heterologous nucleotide sequence; 

(c) lysing the cell to produce a cell lysate; 

(d) contacting the cell lysate or protein-containing sample therefrom with a 
binding agent such that a complex between said protein and binding agent is 
formed; and 

(e) isolating the protein from the complex; 

wherein each step is conducted in a multi-array format. 

21. The method of Claim 20, wherein each site is a well. 

22. The method of Claim 20, wherein said protein is a fusion protein comprising 
an affinity tag to which said binding agent binds. 

23. The method of Claim 20, wherein the cell is a yeast cell. 

24. The method of Claim 20, wherein said lysing step is performed using a paint 
shaker. 

25. A method for detecting a lipid-binding protein comprising the steps of: 

(a) contacting a probe comprising a lipid with a positionally addressable 
array comprising a plurality of proteins, with each protein being at a different 
position on a solid support; and 

(b) detecting any protein-probe interaction, wherein detection of the 
interaction at a position on the solid support indicates the presence of a 
lipid-binding protein at said position. 

26. The method of Claim 25, wherein the lipid is a phospholipid. 

27. The method of Claim 26, wherein the phospholipid is phosphatidylcholine or 
phosphatidylinositol. 

28. The method of Claim 25, wherein the probe comprises a liposome. 

29. A method for detecting a binding protein comprising the steps of: 

(a) contacting a probe with a positionally addressable array comprising a 
plurality of proteins, with each protein being at a different position on a solid 
support, wherein the plurality of proteins comprises at least one protein 
encoded by at least 50% of the known genes in a single species; and 

(b) detecting any protein-probe interaction. 

30. A method for detecting a binding protein comprising the steps of: 

(a) contacting a probe with a positionally addressable array comprising a 
plurality of proteins, with each protein being at a different position on a solid 
support, wherein the plurality of proteins comprises at least 50% of all 
proteins expressed in a single species, wherein protein isoforms and splice 
variants are counted as a single protein; and 

(b) detecting any protein-probe interaction. 

31. A method for detecting a binding protein comprising the steps of: 

(a) contacting a probe with a positionally addressable array comprising a 
plurality of proteins, with each protein being at a different position on a solid 
support, wherein the plurality of proteins comprises at least 1000 proteins 
expressed in a single species; and 

(b) detecting any protein-probe interaction. 

32. A method for detecting a binding protein comprising the steps of: 

(a) contacting a probe with a positionally addressable array comprising a 
plurality of proteins, with each protein being at a different position on a solid 
support, wherein the plurality of proteins in aggregate comprise proteins 
encoded by at least 1000 different known genes in a single species; and 

(b) detecting any protein-probe interaction. 

33. A method for detecting a binding protein comprising the steps of: 

(a) contacting a probe with a positionally addressable array comprising a 
plurality of fusion proteins, with each fusion protein being at a different 
position on a solid support, wherein the fusion protein comprises a first tag, a 
second tag, and a protein sequence encoded by genomic nucleic acid of an 
organism; and 

(b) detecting any protein-probe interaction. 

34. The method of Claim 29, 30, 31, 32, or 33, wherein the probe comprises a 
nucleic acid, protein, small molecule, drug candidate or lipid. 

35. The method of Claim 34, wherein the nucleic acid comprises RNA or DNA. 

36. The method of Claim 34, wherein the probe is a yeast protein. 

37. The method of Claim 36, wherein the yeast protein is Myo2, Rho1, Rho2, 
Rho3, Rho4, Cdc11, Cdc12, or Hsl7. 

38. The method of Claim 34, wherein the probe is an antibody. 

39. The method of Claim 38, wherein the antibody is directed against cyclin, 
kinase, GST, Clb5, Cla4, Ste20, Cdc42, PI(3,4)P2, PI(4)P, SPA2, CLB1, 
CLB2, or Cdc11. 

40. The method of Claim 34, wherein the probe is calmodulin. 

41. The method of Claim 34, wherein the probe comprises a small molecule 
selected from the group consisting of ATP, GTP, cAMP, phosphotyrosine, 
phosphoserine, and phosphothreonine. 

42. The method of Claim 34, wherein the probe comprises phosphatidylcholine 
or phosphatidylinositol. 

43. The method of Claim 34, wherein the probe comprises a liposome. 

44. The method of Claim 25, 29, 30, 31, 32, or 33, wherein the probe is from a 
mammal. 

45. The method of Claim 44, wherein the mammal is human. 

46. The method of Claim 44, wherein the plurality of proteins is non-human. 

47. The method of Claim 25, 29, 30, 31, 32, or 33, wherein the plurality of 
proteins is attached to the solid support via a His tag. 

48. The method of Claim 25, 29, 30, 31, 32, or 33, wherein the solid support 
comprises nickel. 

49. The method of Claim 25, 29, 30, 31, 32, or 33, wherein the solid support 
comprises a nickel-coated slide. 

50. The method of Claim 34, further comprising the step of determining the 
identity of a probe whose interaction with a protein is detected in said 
detecting step. 

51. The method of Claim 50, wherein said interaction indicates that said 
identified probe is an antibacterial, antifungal, or antiviral protein. 

52. A method of labeling a protein for use in a binding assay, comprising the 
steps of: 

(a) contacting separate aliquots of said protein with a biotin-transferring 
compound under conditions and for a period of time to produce said proteins 
that are biotinylated to differing degrees among the different aliquots; and 

(b) combining together said different aliquots to produce a sample of 
differentially biotinylated protein. 

53. A method for detecting a binding protein comprising the steps of: 

(a) contacting a sample of biotinylated protein produced by the method of 
Claim 52 with a positionally addressable array comprising a plurality of 
proteins, with each protein being at a different position on a solid support; 
and 

(b) detecting any positions on the array, wherein interaction between a 
biotinylated protein and a protein on the array occurs. 

54. A method for detecting a binding protein comprising the steps of: 

(a) contacting a sample of biotinylated protein produced by the method of 
Claim 52 with a positionally addressable array comprising a plurality of 
proteins, with each protein being at a different position on a solid support; 

(b) contacting said array with streptavidin conjugated to a fluor; and 

(c) detecting any positions on the array at which fluorescence occurs, 
wherein said fluorescence indicates that interaction between a biotinylated 
protein and a protein on the array occurs. 

55. A method for determining whether a protein preferentially binds 
phosphatidylinositol as compared with phosphatidylcholine, comprising the 
steps of: 

(a) contacting a probe comprising phosphatidylinositol with a positionally 
addressable array comprising a plurality of proteins, with each protein being 
at a different position on a solid support; 




WO 02/092118 



PCT/US02/14982 



(b) detecting protein-probe interaction, wherein said interaction at a position 
on the solid support indicates the presence of a phosphatidylinositol-binding 
protein; 

(c) contacting a probe comprising phosphatidylcholine with a positionally 
addressable array comprising a plurality of proteins, said proteins comprising 
at least some of the same proteins as in step (a), with each protein being at a 
different position on a solid support; 

(d) detecting protein-probe interaction, wherein the interaction at a position 
on the solid support indicates the presence of a phosphatidylcholine-binding 
protein; and 

(e) comparing, for each of a plurality of proteins, the results of steps (b) and 
(d). 

56. A method for determining if a phospholipid regulates a metabolic pathway or 
signal transduction pathway in a cell, or if said metabolic or signal 
transduction pathway occurs on membrane surfaces, comprising the steps of: 

(a) contacting a probe comprising phospholipid with a positionally 
addressable array comprising a plurality of proteins, with each protein being 
at a different position on a solid support, wherein the plurality of proteins 
comprise one or more proteins that form at least part of said pathway; and 

(b) detecting interaction of said probe with a protein in said pathway; 
wherein said interaction indicates that said probe regulates said metabolic 
pathway or signal transduction pathway, or that said pathway occurs on 
membrane surfaces. 

57. A method for making a non-naturally occurring protein that binds 
calmodulin comprising making a non-naturally occurring protein comprising 
the following sequence: 

I/L-Q-X-X-K-K/X-G-B (SEQ ID NO: 1), 
wherein X is any amino acid and B is a basic amino acid. 

58. A method for determining the presence or absence of a posttranslational 
modification in a protein comprising the steps of: 

(a) contacting a probe that binds to said posttranslational modification with a 
positionally addressable array comprising a plurality of proteins, with each 
protein being at a different position on a solid support; and 

(b) detecting any interaction of said probe with a protein; wherein said 
interaction at a position on the solid support indicates that the protein at said 
position has said posttranslational modification. 

59. The method of Claim 58, wherein said posttranslational modification is 
methylation, phosphorylation, biotinylation, acetylation, pegylation, 
glycosylation, lipid modification, ubiquitination, or sumolation. 

60. A method for preparing a culture of yeast cells, comprising the steps of: 

(a) growing a plurality of yeast cells in a growth medium until the OD600 is 
between 0.3 and 1.0, wherein said plurality of yeast cells comprises a 
heterologous nucleotide sequence operatively linked to a regulatory 
sequence; 

(b) contacting said cells with an inducer that enhances expression of a protein 
encoded by said heterologous nucleotide sequence; 

(c) separating said cells from said medium; 

(d) contacting said cells with cold water; 

(e) separating said cells from said cold water; 

(f) contacting said cells with cold lysis buffer; 

(g) separating said cells from said lysis buffer; and 

(h) freezing said cells semi-dry for storage. 

61. A method for purifying a protein from a cell, comprising the steps of: 

(a) for each of a plurality of cell samples, lysing cells in each said sample 
to produce a cell lysate, wherein said cells comprise a fusion protein having 
an affinity tag, and wherein said lysing step is performed using a paint 
shaker; 

(b) separating each said lysate into a soluble fraction and a non-soluble 
fraction; 

(c) transferring each said soluble fraction into a different site of a multi-site 
array, wherein said transferring step is performed using a wide-open tip; 

(d) contacting each said soluble fraction with a binding agent such that a 
complex between said fusion protein and binding agent is formed; 

(e) isolating each said fusion protein from the complex; and 

(f) storing each said fusion protein in a buffer of high viscosity. 

62. A method for identifying whether a signal is positive, comprising the steps 
of: 

(a) determining foreground and background signals for each spot locally and 
determining net signals from the difference between said foreground and 
background signals; 

(b) determining the lower quartile, median, and upper quartile values of a 
first and second net signal distribution; 

(c) subtracting a first median value from said first net signal distribution, and 
subtracting a second median value from said second net signal distribution to 
obtain a first and second subtracted value, respectively; 

(d) dividing said first subtracted value by the difference between said upper 
and lower quartile values of said first signal distribution, and dividing said 
second subtracted value by the difference between said upper and lower quartile 
values of said second signal distribution to obtain a first and second scaled 
value, respectively; 

(e) determining a local median value of a scaled signal distribution of a 
neighborhood region, wherein said neighborhood region comprises a 
plurality of sites in the area; and 

(f) subtracting the local median value from the scaled signal to obtain a 
scaled excess value. 
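
By way of illustration only, and not limitation, steps (a) through (f) above can be sketched in Python for a single array. The function name `scaled_excess`, the representation of an array as a list of rows, the square neighborhood window that includes the spot itself, and the default radius of two rows and two columns are all assumptions made for this sketch; none of them is recited in the claim.

```python
from statistics import median, quantiles


def scaled_excess(foreground, background, radius=2):
    """Return the scaled excess value for every spot of one array.

    foreground, background: lists of rows of locally measured spot signals.
    radius: rows/columns on each side of a spot that form its neighborhood
    (assumed here to be a square window that includes the spot itself).
    """
    rows, cols = len(foreground), len(foreground[0])

    # Step (a): net signal = foreground - background for each spot.
    net = [[foreground[i][j] - background[i][j] for j in range(cols)]
           for i in range(rows)]

    # Step (b): lower quartile, median, and upper quartile of the net signals.
    flat = [v for row in net for v in row]
    lower, mid, upper = quantiles(flat, n=4, method="inclusive")

    # Steps (c)-(d): subtract the median, divide by the interquartile range.
    scaled = [[(v - mid) / (upper - lower) for v in row] for row in net]

    # Steps (e)-(f): subtract the local median of the neighborhood region.
    excess = []
    for i in range(rows):
        out_row = []
        for j in range(cols):
            window = [scaled[r][c]
                      for r in range(max(0, i - radius), min(rows, i + radius + 1))
                      for c in range(max(0, j - radius), min(cols, j + radius + 1))]
            out_row.append(scaled[i][j] - median(window))
        excess.append(out_row)
    return excess
```

Applying the same function to each array separately is intended to place signals measured on different arrays on a common scale before any threshold for a positive signal is applied.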

63. The method of Claim 62, wherein a positive signal indicates protein-probe 
interaction. 

64. The method of Claim 62, wherein the neighborhood region is two rows 
above, two rows below, two columns to the left, and two columns to the right 
of the signal. 

65. The method of Claim 62, further comprising the step of excluding parallel 
samples of scaled excess values if the difference between one of the sample 
values and the average of the sample values is greater than three standard 
deviations of the error of the scaled excess value. 

66. A method for identifying positive signals among signals measured with a 
plurality of different arrays, comprising the steps of: 

(a) transforming signals measured with different arrays to generate 
transformed signals; 

(b) correcting each said transformed signal by a method comprising 
subtracting from said transformed signal a local median signal to generate a 
corrected transformed signal, wherein said local median signal is the median 
of signals in a neighborhood region, said neighborhood region comprising 
one or more sites around the site of said transformed signal; and 

(c) comparing each said corrected transformed signal to a threshold 
value, and identifying said corrected transformed signal as positive if said 
corrected transformed signal is greater than said threshold value; 

wherein said array comprises a positionally addressable array 
comprising a plurality of proteins, with each protein being at a 
different position on a solid support. 

67. The method of Claim 66, wherein said step of transforming comprises the 
steps of: 

(a) determining for signals measured with each of said different arrays the 
lower quartile value, the median value, and the upper quartile value of the 
signal distribution; 

(b) subtracting from signals measured with each of said different arrays said 
median value to obtain translated signals for said array; and 

(c) dividing said translated signals by the difference between said upper and 
lower quartile values of said array, thereby generating said transformed 
signals. 

68. The method of Claim 66, wherein said neighborhood region consists of sites 
within an area two rows above, two rows below, two columns to the left, and 
two columns to the right of the transformed signal. 


69. The method of Claim 66, 67, or 68, wherein said positive signal indicates 
protein-probe interaction. 

70. The method of Claim 66, 67, or 68, further comprising the step of discarding 
data points, wherein said data points are measured at duplicate sites on the 
array; and wherein the variation between said duplicate sites is greater than 
three standard deviations. 

71. The method of Claim 66, 67, or 68, further comprising the step of 
normalizing the corrected transformed signals using the formula: 

$r \pm s_r = (G \pm s_G) / (R \pm s_R)$ 

wherein G is a corrected transformed signal, R is a GST signal, $s_G$ is the error 
of G, $s_R$ is the error of R, and $s_r$ is the error of r. 
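
As a non-limiting sketch, the normalization above can be computed as follows, assuming that the formula expresses the ratio r = G/R and that the errors of G and R are independent, so that first-order error propagation applies. The function name `normalize_to_gst` and the propagation rule are assumptions of this sketch; the claim does not specify how the error of r is obtained.

```python
import math


def normalize_to_gst(g, g_err, r_gst, r_gst_err):
    """Return (ratio, ratio_err) for a probe signal normalized to a GST signal.

    g, g_err: corrected transformed signal and its error.
    r_gst, r_gst_err: GST signal and its error (r_gst must be non-zero).
    """
    ratio = g / r_gst
    # First-order propagation for a quotient of independent quantities,
    # written so that it remains defined when g is zero.
    ratio_err = math.sqrt((g_err / r_gst) ** 2
                          + (g * r_gst_err / r_gst ** 2) ** 2)
    return ratio, ratio_err
```

For example, a probe signal of 10 with error 1, normalized to a GST signal of 5 with error 0.5, gives a ratio of 2 with an error of about 0.28 under these assumptions.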

72. The method of Claim 20, wherein the multi-site array is a 96-site array. 

73. A method for making a positionally addressable array comprising the step of 
attaching a plurality of fusion proteins to a surface of a solid support, with 
each fusion protein being at a different position on the solid support, wherein 
the fusion protein comprises a first tag, a second tag, and a protein sequence 
encoded by genomic nucleic acid of an organism. 

74. The method of Claim 62 or 65, further comprising the steps of: 

(a) averaging the values of the signals of two duplicate spots to obtain an 
average value; and 

(b) determining whether said average value is greater than three standard 
deviations of the error of the scaled excess value, 

wherein a signal of said spot is positive if said average value is greater than 
three standard deviations of the error of the scaled excess value. 
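
The duplicate-spot test of steps (a) and (b) can be illustrated as follows. This is a minimal sketch: the function name `call_duplicate_spots`, the three-way return value, and the prior exclusion of discordant duplicates in the manner of Claim 65 are assumptions made for illustration, and `error_sd` stands for one standard deviation of the error of the scaled excess value, however that error is estimated.

```python
def call_duplicate_spots(value_a, value_b, error_sd, n_sd=3.0):
    """Classify a pair of duplicate spots as 'positive', 'negative', or 'excluded'."""
    average = (value_a + value_b) / 2.0
    # For two values, each lies the same distance from their average,
    # so checking one of them is sufficient.
    if abs(value_a - average) > n_sd * error_sd:
        return "excluded"
    # A spot is called positive if the average exceeds n_sd standard deviations.
    return "positive" if average > n_sd * error_sd else "negative"
```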

75. A method for determining the presence or absence of an enzymatic activity 
in a protein comprising the steps of: 

(a) contacting a probe that is a substrate for said enzymatic activity with a 
positionally addressable array comprising a plurality of proteins, with each 
protein being at a different position on a solid support; and 

(b) detecting any catalysis of said substrate at a position on the solid support; 
wherein said catalysis at a position on the solid support indicates that the protein at 
said position has said enzymatic activity. 

76. A method for determining the presence or absence of an enzyme substrate in 
a protein comprising the steps of: 

(a) contacting a probe that is an enzyme for said enzyme substrate with a 
positionally addressable array comprising a plurality of proteins, with each 
protein being at a different position on a solid support; and 

(b) detecting any catalysis of said substrate at a position on the solid support; 
wherein said catalysis at a position on the solid support indicates that the 
protein comprises said enzyme substrate. 

77. The array of Claim 1, 3, 4, or 5, wherein the proteins are attached to the solid 
support via a biotin tag. 

78. The method of Claim 15, wherein the first tag is found at the carboxy- 
terminal end of the protein and the second tag is found at the amino-terminal 
end of the protein. 

79. The method of Claim 22, wherein the affinity tag is biotin. 

80. The array of Claim 1, 3, 4, or 5, wherein the solid support comprises a 
nitrocellulose-coated slide. 

SEQUENCE LISTING 

<110> Yale University 

<120> GLOBAL ANALYSIS OF PROTEIN ACTIVITIES 
USING PROTEOME CHIPS 

<130> 6523-035-228 



<140> To be assigned 
<141> Herewith 

<150> 60/290,583 
<151> 2001-05-11 



<150> 60/308,149 
<151> 2001-07-26 



<160> 16 



<170> FastSEQ for Windows Version 4.0 



<210> 1 
<211> 7 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> consensus sequence for calmodulin-binding 
protein 

<221> VARIANT 
<222> 1 
<223> Xaa = Ile or Leu 

<221> VARIANT 
<222> 3 
<223> Xaa = any amino acid 

<221> VARIANT 
<222> 5 
<223> Xaa = Lys or any amino acid 

<221> VARIANT 
<222> 7 
<223> Xaa = basic amino acid 

<400> 1 
Xaa Gln Xaa Lys Xaa Gly Xaa 
1               5 



<210> 2 
<211> 17 
<212> PRT 

<213> Saccharomyces cerevisiae 



<400> 2 

Leu Lys Glu Thr Leu Gln Ser Val Lys Ser Leu Lys Asp Ala Leu Asn 
1               5                   10                  15 
Asp 

<210> 3 
<211> 17 
<212> PRT 
<213> Saccharomyces cerevisiae 

<400> 3 
His Ser Val Asp Leu Gln Ser Ser Lys Phe Gln Leu Ala Ile Val Cys 
1               5                   10                  15 
Thr 

<210> 4 
<211> 17 
<212> PRT 
<213> Saccharomyces cerevisiae 

<400> 4 
Asp Glu His Phe Ile Gln Arg Leu Pro Ser Thr Arg Leu Asn Ser Thr 
1               5                   10                  15 
Asp 

<210> 5 
<211> 17 
<212> PRT 
<213> Saccharomyces cerevisiae 

<400> 5 
Ala Lys Ile Pro Leu Gln Arg Leu Gly Ser Thr Arg Asp Ile Ala Glu 
1               5                   10                  15 
Ser 

<210> 6 
<211> 17 
<212> PRT 
<213> Saccharomyces cerevisiae 

<400> 6 
Asp Asp Leu Arg Leu Gln Ser Gln Lys Lys Gly Gly Glu Leu Thr Glu 
1               5                   10                  15 
Glu 



<210> 7 
<211> 17 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 7 

Leu Asn Pro Ile Ile Gln Asp Thr Lys Lys Gly Lys Leu Arg Phe Val 
1               5                   10                  15 
Arg 

<210> 8 
<211> 13 
<212> PRT 
<213> Saccharomyces cerevisiae 

<400> 8 
Arg Lys Ala Leu Ile Gln Arg Lys Gly Gly Lys Leu Glu 
1               5                   10 

<210> 9 
<211> 17 
<212> PRT 
<213> Saccharomyces cerevisiae 

<400> 9 
Val Val Asp Pro Ile Gln Ser Val Lys Gly Lys Val Val Ile Asp Ala 
1               5                   10                  15 
Phe 

<210> 10 
<211> 16 
<212> PRT 
<213> Saccharomyces cerevisiae 

<400> 10 
Gly His Pro Ile Ile Gln Asp Lys Glu Gly Asn Gly Val Leu Ile Cys 
1               5                   10                  15 

<210> 11 
<211> 17 
<212> PRT 
<213> Saccharomyces cerevisiae 

<400> 11 
Arg Val Trp Gly Ile Gln Pro Val Asn Lys Lys Phe Asn Ala Arg Ser 
1               5                   10                  15 
Ala 



<210> 12 
<211> 17 
<212> PRT 

<213> Saccharomyces cerevisiae 
<400> 12 

Leu Leu Arg Glu Ile Gln Ser Lys Arg Ser Lys Lys Asp Glu Glu Gly 
1               5                   10                  15 
Lys 

<210> 13 
<211> 17 
<212> PRT 
<213> Saccharomyces cerevisiae 

<400> 13 
Leu Asn Met Lys Ile Gln Lys Leu Arg Asp Leu Tyr Leu Glu Gln Thr 
1               5                   10                  15 
Glu 

<210> 14 
<211> 16 
<212> PRT 
<213> Saccharomyces cerevisiae 

<400> 14 
Asp Leu Phe Gly Ile Gln His Cys His Asn Ile Asp Val Lys Arg Leu 
1               5                   10                  15 

<210> 15 
<211> 17 
<212> PRT 
<213> Saccharomyces cerevisiae 

<400> 15 
Asn Gly Leu Leu Ile Gln Ser Ser Lys Phe Ile Ser Lys Val Leu Leu 
1               5                   10                  15 
Thr 

<210> 16 
<211> 11 
<212> PRT 
<213> Artificial Sequence 

<220> 
<223> motif found in many calmodulin-binding proteins 

<221> VARIANT 
<222> 3, 4, 5, 6, 8, 9, 10 
<223> Xaa = Any Amino Acid 

<400> 16 
Ile Gln Xaa Xaa Xaa Xaa Lys Xaa Xaa Xaa Arg 
1               5                   10 